Preface


This book covers the basic theory of complex analysis and a selection of advanced top- 
ics. It evolved out of lecture notes from two quarter-long graduate classes that I taught 
several times at the University of California, Davis in 2016-2021. The book is primarily 
aimed at graduate students, advanced undergraduate students, and postgraduate math- 
ematics researchers. It is suited for self-study or as a primary reference material for 
approximately two semester-long graduate-level university courses. 

The advanced topics covered in Chapters 2-5 are classical and are discussed in many 
other places. It is my hope that my own exposition advances the pedagogy of the subject, 
if only ever so slightly, by simplifying the explanations, logical arguments, notation, etc.,
as much as it has been within my power to do. 

The last chapter, Chapter 6, is more modern in content and covers Maryna Viazovska’s
spectacular application of modular forms to the solution of the sphere packing problem
in dimension 8. Published in 2016, this work has until now been accessible only through
the primary literature [71] and a few expository papers [12, 13, 20, 52]. The detailed
exposition of Viazovska’s work in Chapter 6, and the accompanying Appendix A covering
the relevant background material on sphere packing, should be useful to students and
researchers wishing to get up to speed on these beautiful recent developments, which
are at the forefront of much ongoing research.

The choice of topics you will find in this work is idiosyncratic and reflects my own
mathematical taste, interests, and biases. I make no claim that they are the most
important parts of the vast theory that is complex analysis; only that they are beautiful,
that they relate to many topics and theories that are of interest to a broad section of
pure mathematicians, and that they are, broadly speaking, a fine set of mathematical
ideas that one could devote one’s time to studying and thinking about. I hope some
readers will agree.

I am grateful to Guy Kindler for help with the book cover design and to Christopher
Alexander, Jennifer Brown, Brynn Caddel, Keith Conrad, Bo Long, Anthony Nguyen,
Jianping Pan, and Brad Velasquez for helpful comments on versions of the lecture notes
the book evolved from.


Davis Dan Romik 
March 2023 
0 Prerequisites and notation


0.1 Prerequisites 


This book assumes knowledge of the following subjects, roughly at the level covered by 
advanced undergraduate courses in the United States: 

— Real analysis and multivariable calculus

— Topology of $\mathbb{R}^n$ (mostly for n = 2)

— Complex numbers and their basic properties

— The transcendental functions $e^z$, sin z, cos z of a complex variable


In a few places, some familiarity with Fourier analysis is needed to fully understand 
the material. Specifically, in Chapter 2 the Poisson summation formula is derived from 
basic properties of Fourier series, and this is used to prove some of the fundamental 
properties of the Riemann zeta function. Chapter 6 and Appendix A assume knowledge 
of the Fourier transform in $\mathbb{R}^n$ and its basic properties.

Starting in Chapter 3, and increasingly in Chapter 5, knowledge of the basic language 
of group theory may be needed to fully understand some of the topics being discussed. 
No results from group theory are used beyond the definition of a quotient group. 


0.2 Notation 


The following notation is used throughout the book. 

— $\mathbb{R}$ — the real numbers
— $\mathbb{C}$ — the complex numbers
— $\mathbb{Z}$ — the integers
— $i$ — the imaginary unit
— Re(z) — the real part of a complex number z
— Im(z) — the imaginary part of a complex number z
— $\overline{z}$ — the complex conjugate of a complex number z
— |z| — the modulus of a complex number z
— arg z — the argument of a complex number z
— $D_R(z)$ — the open disc of radius R centered at z
— $D_{\le R}(z)$ — the closed disc of radius R centered at z
— $C_R(z)$ — the circle of radius R centered at z
— cl(E) — the topological closure of a set $E \subset \mathbb{C}$
— $\mathbb{D}$ — the open unit disc $D_1(0)$
— $\mathbb{H}$ — the upper half-plane: $\{z \in \mathbb{C} : \operatorname{Im}(z) > 0\}$
— $\Omega$ — a complex region (open and connected subset of $\mathbb{C}$)


Big-O notation and asymptotic equality. In a few places, the standard big-O notation
is used. Formally, the statement “F = O(G),” where F, G are complex-valued quantities
that depend on one or more variables, means that |F| ≤ C|G| for some constant C > 0
when the variable or variables in question range over some specified set of values (usually a neighborhood of
some limiting point). Big-O expressions can also be combined in various ways in formulas,
e.g., “f(t) = O(e™) + O(t^2) as t → ∞” means that f(t) can be expressed as a sum of
two quantities of the forms O(e™) and O(t^2), respectively, as t → ∞.

The statement F ~ G (read as “F is asymptotically equal to G”) means that F/G 
converges to 1 in some limiting sense, which is either specified explicitly or inferred 
from the context. For example, 


$$\sin(x) \sim x \quad \text{as } x \to 0$$

states an asymptotic equality, as does

$$\frac{(2n)!}{(n!)^2} \sim \frac{4^n}{\sqrt{\pi n}}$$

as n → ∞.
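
As a quick numerical sanity check of the asymptotic equality for the central binomial coefficient, the following Python sketch (not part of the original text; the function name `central_binomial_ratio` is an illustrative choice) computes the ratio of the two sides, which should approach 1 as n grows. Logarithms are used only to avoid floating-point overflow.

```python
import math

def central_binomial_ratio(n: int) -> float:
    """Return (2n)!/(n!)^2 divided by 4^n / sqrt(pi * n)."""
    exact = math.comb(2 * n, n)  # exact integer value of (2n)!/(n!)^2
    # Compare logarithms so that huge values do not overflow a float.
    log_approx = n * math.log(4.0) - 0.5 * math.log(math.pi * n)
    return math.exp(math.log(exact) - log_approx)

for n in (1, 10, 100, 1000, 10000):
    print(n, central_binomial_ratio(n))
```

The printed ratios increase toward 1, which is what the statement F ~ G asserts; the check illustrates the definition and is of course not a proof.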


Exercises for Chapter 0 




0.1 Important formulas. Below there is a list of basic formulas in complex analysis.
Review each of them, making sure that you understand what it says and why it is
true; that is, if it is a theorem, then prove it, or if it is a definition, then make sure
you understand it.

In the formulas below, a, b, c, d, t, x, y denote arbitrary real numbers, and w, z
denote arbitrary complex numbers.


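a. $i^2 = -1$
b. $(a + bi)(c + di) = (ac - bd) + (ad + bc)i$
c. $\frac{1}{i} = -i$
d. $z = \operatorname{Re}(z) + i \operatorname{Im}(z)$
e. $\overline{z} = \operatorname{Re}(z) - i \operatorname{Im}(z)$
f. $\operatorname{Re}(z) = \frac{z + \overline{z}}{2}$
g. $\operatorname{Im}(z) = \frac{z - \overline{z}}{2i}$
h. $|z|^2 = z \overline{z}$
i. $\frac{1}{z} = \frac{\overline{z}}{|z|^2}$
j. $\frac{1}{x + iy} = \frac{x - iy}{x^2 + y^2}$
k. $|wz| = |w| \cdot |z|$
l. $\bigl| |w| - |z| \bigr| \le |w + z| \le |w| + |z|$
m. $e^{x+iy} = e^x (\cos(y) + i \sin(y))$
n. $|e^z| = e^{\operatorname{Re}(z)}$
o. $|e^z| \le e^{|z|}$
p. $e^{it} = \cos(t) + i \sin(t)$
q. $|e^{it}| = 1$
r. $\cos(t) = \frac{e^{it} + e^{-it}}{2}$
s. $\sin(t) = \frac{e^{it} - e^{-it}}{2i}$
t. $e^{\pi i} = -1$
u. $e^{\pm \pi i/2} = \pm i$
v. $e^{2\pi i} = 1$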


0.2 Reminder of basic analysis concepts. Remind yourself of the definitions of the 
following terms in real and complex analysis and the topology of C, referring to 
other textbooks or online sources if necessary. 


a. real part
b. imaginary part
c. complex conjugate
d. modulus
e. argument
f. open set (in C)
g. closed set
h. closure
i. connected set
j. bounded set
k. compact set
l. region
m. convergent sequence
n. Cauchy sequence
o. limit point
p. accumulation point
q. continuous function


1 Basic theory 


What is unpleasant here, and indeed directly to be objected to, is the use of complex numbers. ψ is
surely fundamentally a real function.


Erwin Schrödinger, June 6, 1926 letter to Hendrik Lorentz 


1.1 Motivation: why study complex analysis? 


This book is about complex analysis, the area of mathematics that studies holomorphic 
functions of a complex variable and their properties. Although this may sound a bit 
specialized, there are (at least) two excellent reasons why all mathematicians should 
learn about complex analysis. First, it is, in my humble opinion, one of the most beautiful 
areas of mathematics. One way of putting it is that complex analysis seems to have a very 
high ratio of theorems to definitions (i.e., a very low “entropy”): you get a lot more as
“output” than you put in as “input.”

The second reason is that complex analysis and, more generally, complex numbers, 
have a large number of applications in both the pure mathematics and applied math- 
ematics senses of the word. Moreover, many of these applications are to problems that 
a priori look like they ought to have little to do with complex numbers. Here are a few 
examples, including some that will be discussed later in the book: 

- Solving polynomial equations. In 1545, the Italian thinker Gerolamo Cardano pub- 
lished the famous formula for solving cubic equations, after learning of the solution 
found earlier by Scipione del Ferro. Historically, this appears to have been the first 
problem in mathematics to be solved using complex numbers. One surprising aspect 
of Cardano’s formula is that it sometimes requires taking operations in the complex 
plane as an intermediate step to get to the final answer, even when the cubic equation 
being solved has only real roots. 

— Proving asymptotic formulas. A well-known approximation to the factorial func- 
tion n! is given by Stirling’s formula, which states that the behavior of the factorial 
function for large values of n is given by 


$$n! \sim \sqrt{2\pi n}\,\Bigl(\frac{n}{e}\Bigr)^n \tag{1.1}$$


(using the notation of Section 0.2). Another famous asymptotic formula is the Hardy- 
Ramanujan formula, which states that the number p(n) of integer partitions of n 
behaves for large n like 


$$p(n) \sim \frac{1}{4\sqrt{3}\,n}\, e^{\pi\sqrt{2n/3}}. \tag{1.2}$$

A standard approach to proving these types of results uses complex analysis, as dis- 
cussed, for example, in [28]. 


- Counting prime numbers. Let π(n) denote the number of primes less than or equal
to n. This function is known as the prime-counting function. The prime number
theorem states that

$$\pi(n) \sim \frac{n}{\log n} \quad \text{as } n \to \infty.$$


This is one of the most celebrated asymptotic formulas (and, indeed, one of the most 
famous theorems) in mathematics. Because it deals with prime numbers, it stands 
apart from the more general class of asymptotic formulas, such as (1.1)-(1.2) men- 
tioned above, and its proof requires more specialized techniques. A standard path to 
a proof of the prime number theorem goes through complex analysis, and this is the 
subject of Chapter 2. 

- Evaluation of complicated definite integrals. Complex analysis offers a set of tech- 
niques for evaluating definite integrals that are difficult or impossible to derive using 
standard calculus methods. An example is the integral 


$$\int_0^\infty \sin(t^2)\,dt = \frac{\sqrt{\pi}}{2\sqrt{2}}$$

(known as one of the Fresnel integrals). See Exercise 1.47 at the end of this chapter 
for additional examples. 

- Solving partial differential equations. Complex-analytic techniques are very use- 
ful for solving several kinds of partial differential equation, particularly those arising 
in various applied physics problems in hydrodynamics, heat conduction, electrostat- 
ics, and more. 

— Analyzing alternating current electrical networks. Electrical engineers learn that 
the usefulness of Ohm’s law can be greatly extended by generalizing the notion of 
electrical resistance to that of electrical impedance, a complex-valued quantity. 
Complex analysis also has many other important applications in electrical engineer- 
ing, signal processing, and control theory. 

— Solution of the sphere packing problem in 8 and 24 dimensions. It was proved in
2016 that the optimal densities for packing unit spheres in 8 and 24 dimensions are
$\pi^4/384$ and $\pi^{12}/12!$, respectively. The proofs make use of complex analysis in a
fundamental way. The proof for the case of 8 dimensions is presented in Chapter 6.

- Applications in probability and combinatorics. Over the last few decades, com- 
plex analysis has been applied in spectacular ways to prove asymptotic results 
in probability and combinatorics. One such application is a proof of the Cardy- 
Smirnov formula in percolation theory, which answers the following question: 
consider a parallelogram-shaped section of cells in the honeycomb lattice with m 
rows of cells, each containing n cells. Each cell is colored either black or white ac- 
cording to the outcome of a fair coin toss, independently of all other cells (Fig. 1.1(a)). 
Figure 1.1: (a) Percolation on a honeycomb: the Cardy-Smirnov formula gives the asymptotic probability
of a left-to-right crossing event. In this sample configuration, a left-to-right crossing has occurred, as illus- 
trated by the trail of red dots representing one possible crossing path. (b) A self-avoiding walk of length 45 
on the hexagonal lattice. 


A left-to-right crossing event is the event that we can find a contiguous path of 
white-colored cells connecting the left edge of the parallelogram to the right edge. 
What is the asymptotic probability of this event in the limit as the side lengths of 
the parallelogram grow to infinity but its shape tends toward a parallelogram with 
a fixed aspect ratio? 

Specifically, let P(m, n) denote the probability of a left-to-right crossing event. Cardy 
conjectured [10] and Smirnov proved [64] the following result. 


Theorem 1.1 (Cardy-Smirnov formula). As m, n → ∞ with the aspect ratio m/n converging
to a fixed value λ ∈ (0, ∞), the probabilities P(m, n) have the limiting behavior

$$\lim_{\substack{m,n\to\infty \\ m/n\to\lambda}} P(m,n) = \Phi(\lambda)$$

for an explicit function Φ(λ).


A detailed account of Smirnov’s proof can be found in [34, 73]. The function ®(A) is 
most naturally defined as a certain geometric invariant associated with the parallel- 
ogram with corners 0, 1, (i )A, and (8 +1and can be written down explicitly 
in terms of modular forms [43] and other special functions from complex analysis. 

A second example of a recent application of complex analysis to probability and
combinatorics is the evaluation of the connective constant of the hexagonal lattice. Let
$c_n$ denote the number of self-avoiding walks of length n in the hexagonal lattice that
start at the origin; that is, hexagonal lattice paths that do not intersect themselves;
see Fig. 1.1(b). Without the condition of the path being self-avoiding, the number of
such paths would be exactly equal to $3^n$. The sequence $(c_n)_{n=0}^{\infty}$, with initial values
1, 3, 6, 12, 24, 48, 90, 174, 336, ... [W1], is much more mysterious, and its rate of growth
(as well as the rates of growth of similar sequences associated with the square lattice
and other natural lattices) has been the subject of much study. From general
considerations it is fairly easy to see that the sequence grows roughly exponentially, that is,
there exists a constant μ > 0 such that $c_n^{1/n} \to \mu$ as n → ∞. The constant μ is known
as the connective constant of the hexagonal lattice. Nienhuis [51] conjectured in 1982
and Duminil-Copin and Smirnov [65] proved in 2010 the following remarkable result
concerning the value of μ.


Theorem 1.2 (Duminil-Copin-Smirnov theorem). The connective constant of self-avoiding
walks in the hexagonal lattice is equal to $\sqrt{2+\sqrt{2}} \approx 1.84776$, that is, the numbers
$c_n$ satisfy

$$\lim_{n\to\infty} c_n^{1/n} = \sqrt{2+\sqrt{2}}.$$


— Running the universe. Nature uses complex numbers in the fundamental laws of
physics, Schrödinger’s equation and quantum field theory. This is not a mere
mathematical convenience or sleight-of-hand, but appears to be a built-in feature of the
very equations describing our physical universe. Why? No one knows. (But it is a
fun topic for debate; see, e.g., [42], [W2], [W3].)

- Conformal maps. A conformal map is a mapping from one planar region to another 
that preserves angles. This notion, which comes up in purely geometric applications
where the algebraic or analytic structure of complex numbers seems irrelevant, is
in fact deeply tied to complex analysis. Conformal maps were used by the Dutch artist
M.C. Escher (though he had no formal mathematical training) to create amazing art 
and used by others to better understand, and even to improve on, Escher’s work. See 
Fig. 1.2 and [21, 59] for more on the connection of Escher’s work to mathematics. We 
discuss conformal maps in detail in Chapter 3. 

— Proving number-theoretic identities. Lagrange proved in 1770 a classic result in
number theory, which states that every positive integer can be represented as a sum
of four squares of integers. Jacobi later proved a more precise fact: if we denote by
$r_4(n)$ the number of distinct ways in which a positive integer n can be represented as
a sum of four squares (with different orderings counting as distinct), then we have
the remarkable identity

$$r_4(n) = 8 \sum_{d \mid n,\ 4 \nmid d} d. \tag{1.3}$$


(In words: eight times the sum of divisors of n that are not divisible by 4.) This
beautiful identity and many others like it with a number-theoretic flavor can be proved
using complex analysis; see Chapter 5 (and Exercise 5.21 at the end of that chapter
for the particular application to proving (1.3)).

1 Schrödinger himself appeared dissatisfied with the idea that his equation uses complex numbers to
describe physical reality. See the epigraph at the beginning of this chapter and [42] for further discussion.

Figure 1.2: The photo is only available in the printed edition.

- Complex dynamics. Iteration of complex-analytic maps can be used to generate
beautiful fractals with remarkable properties. A famous example is the iconic
Mandelbrot set (Fig. 1.3) defined as the set of complex numbers $c \in \mathbb{C}$ for which the
sequence of functional iterates $f_c^{(n)}$ of the map $f_c(z) = z^2 + c$ starting from the
point z = 0 remains bounded.
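
The definition of the Mandelbrot set in the last item translates directly into a simple numerical membership test. The sketch below is an illustration rather than part of the original text: the function name `escapes`, the iteration cap `max_iter`, and the threshold `bound` are assumptions made for the example. It relies on the standard fact that once an iterate has modulus greater than 2, the orbit of 0 is unbounded; a point that has not escaped after `max_iter` steps is only presumed to lie in the set.

```python
def escapes(c: complex, max_iter: int = 200, bound: float = 2.0) -> bool:
    """Return True if the orbit of 0 under z -> z*z + c leaves the disc |z| <= bound."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > bound:
            return True   # the orbit is unbounded, so c is not in the Mandelbrot set
    return False          # no escape detected within max_iter steps

# A coarse character-based picture of the region -2 <= Re(c) <= 0.6, -1.1 <= Im(c) <= 1.1.
for row in range(21):
    y = 1.1 - 0.11 * row
    print("".join(" " if escapes(complex(-2.0 + 0.04 * col, y)) else "#"
                  for col in range(66)))
```

Increasing `max_iter` sharpens the picture near the boundary, where orbits can take a long time to escape.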


This has been just a short and necessarily very incomplete survey on the importance 
of complex analysis. There are many other intriguing applications and connections of 
complex analysis to other areas of mathematics. 
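
Some of the statements in this survey can be tested by direct computation. As one example, the following Python sketch (an illustration only; the function names `r4_brute_force` and `r4_jacobi` are hypothetical) checks Jacobi's four-square identity (1.3) for small n by counting representations directly and comparing with eight times the sum of the divisors of n that are not divisible by 4.

```python
from math import isqrt

def r4_brute_force(n: int) -> int:
    """Count integer 4-tuples (a, b, c, d) with a^2 + b^2 + c^2 + d^2 = n."""
    m = isqrt(n)
    count = 0
    for a in range(-m, m + 1):
        for b in range(-m, m + 1):
            for c in range(-m, m + 1):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    continue
                d = isqrt(rest)
                if d * d == rest:
                    # d = 0 gives one solution; otherwise both +d and -d work.
                    count += 1 if d == 0 else 2
    return count

def r4_jacobi(n: int) -> int:
    """Right-hand side of (1.3): 8 times the sum of divisors of n not divisible by 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)

for n in range(1, 13):
    print(n, r4_brute_force(n), r4_jacobi(n))
```

The two columns agree for every n tried, with signs and orderings of the four integers counted as distinct representations.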

In the next section, I will begin our journey into the subject by proving a famous 
theorem about polynomials over the complex numbers. 


Figure 1.3: (a) The Mandelbrot set. (b) Magnified details of a small region.


1.2 The fundamental theorem of algebra 


One of the most famous results about complex numbers is the fundamental theorem of 
algebra. Although the statement of the theorem is indeed very fundamental to algebra, 
most of its known proofs rely on complex analysis in an essential way. Looking at a few 
of these proofs seems like a fitting place to start our journey into the theory. 


Theorem 1.3 (Fundamental theorem of algebra). Every nonconstant polynomial

$$p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0 \qquad (n \ge 1) \tag{1.4}$$

with complex coefficients has a complex root.


The fundamental theorem of algebra is a striking and subtle result and has many 
beautiful proofs. I will show you three of them. 


First proof: analytic proof. Let p(z) be as in (1.4), and consider where |p(z)| attains its
infimum.
First, note that the infimum cannot be attained as |z| → ∞, since

$$|p(z)| = |z|^n \cdot \bigl| a_n + a_{n-1} z^{-1} + \cdots + a_0 z^{-n} \bigr|$$

and, in particular,

$$\lim_{|z|\to\infty} \frac{|p(z)|}{|z|^n} = |a_n|, \tag{1.5}$$

so for large |z|, it is guaranteed that $|p(z)| > |p(0)| = |a_0|$. Now fix some radius R > 0 for
which |z| > R implies $|p(z)| > |a_0|$, and choose a complex number $z_0$ in the closed disc
$|z| \le R$ for which $|p(z_0)| = \min_{|z| \le R} |p(z)|$. (The minimum exists because |p(z)| is a
continuous function on the closed disc, which is compact.) We then have that

$$m_0 := \inf_{z \in \mathbb{C}} |p(z)| = \inf_{|z| \le R} |p(z)| = \min_{|z| \le R} |p(z)| = |p(z_0)|.$$


Denote $w_0 = p(z_0)$, so that $m_0 = |w_0|$. We now claim that $m_0 = 0$. Indeed, assume by
contradiction that this is not the case. The idea is now to examine the local behavior of
p(z) around $z_0$. Expanding p(z) in powers of $z - z_0$, we can write

$$p(z) = w_0 + \sum_{j=1}^{n} c_j (z - z_0)^j$$

for some complex coefficients $c_1, \ldots, c_n$. This can also be written as

$$p(z) = w_0 + c_k (z - z_0)^k + \cdots + c_n (z - z_0)^n, \tag{1.6}$$

where we denote by k the minimal positive index for which $c_k \ne 0$. Now imagine starting
at the initial point $z = z_0$ and then making a small perturbation away from $z_0$ in the
direction of some unit vector $e^{i\theta}$. We estimate the way that such a perturbation affects
the value p(z). Expansion (1.6) gives

$$p(z_0 + re^{i\theta}) = w_0 + c_k r^k e^{ik\theta} + c_{k+1} r^{k+1} e^{i(k+1)\theta} + \cdots + c_n r^n e^{in\theta}. \tag{1.7}$$

When r (the magnitude of the perturbation) is very small, the power $r^k$ dominates the
other terms $r^j$ with $k < j \le n$; that is, (1.7) can be rewritten as

$$\begin{aligned}
p(z_0 + re^{i\theta}) &= w_0 + r^k \bigl( c_k e^{ik\theta} + c_{k+1} r e^{i(k+1)\theta} + \cdots + c_n r^{n-k} e^{in\theta} \bigr) \\
&= w_0 + c_k r^k e^{ik\theta} \bigl( 1 + g(r,\theta) \bigr),
\end{aligned} \tag{1.8}$$

where we denote

$$g(r,\theta) = \sum_{j=k+1}^{n} \frac{c_j}{c_k}\, r^{j-k} e^{i(j-k)\theta}.$$

Note that g(r, θ) satisfies a bound of the form

$$|g(r,\theta)| \le A r \tag{1.9}$$

for all r ∈ [0, 1] and some constant A > 0.


To reach a contradiction, we now choose θ, the angle of the perturbation, to be such
that the vector $c_k r^k e^{ik\theta}$ “points in the opposite direction” from $w_0$, that is, such that

$$\frac{c_k r^k e^{ik\theta}}{w_0} \in (-\infty, 0).$$

This is clearly possible: take $\theta = \frac{1}{k}(\arg w_0 - \arg(c_k) + \pi)$. The idea in doing this is that for
this choice of θ, the expression $w_0 + c_k r^k e^{ik\theta}$ that forms the dominant term in (1.8) will
have a smaller magnitude than $w_0$ if r is chosen small enough.

To make this precise, choose a number r ∈ (0, 1] smaller than the minimum of the
two numbers 1/(2A) (where A is the constant in (1.9)) and $(|w_0|/|c_k|)^{1/k}$. This choice
ensures the two inequalities

$$|c_k r^k e^{ik\theta}| < |w_0| \quad \text{and} \quad |g(r,\theta)| < \tfrac{1}{2}.$$

With those choices for θ and r, we have that

$$\begin{aligned}
|p(z_0 + re^{i\theta})| &= \bigl| w_0 + c_k r^k e^{ik\theta} (1 + g(r,\theta)) \bigr| \le |w_0 + c_k r^k e^{ik\theta}| + |c_k r^k e^{ik\theta} g(r,\theta)| \\
&= |w_0| - |c_k| r^k + |c_k| r^k |g(r,\theta)| < |w_0| - \tfrac{1}{2} |c_k| r^k < |w_0| = |p(z_0)|.
\end{aligned}$$

This is in contradiction to the defining property of $z_0$ and completes the proof.
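
The perturbation argument in this proof is constructive enough to be tried on a computer. The following Python sketch is an illustration under stated assumptions, not the author's algorithm: the helper names, the tolerance `tol` used to decide which Taylor coefficient counts as nonzero, and the sample polynomial are all hypothetical choices. Given a point $z_0$ with $p(z_0) \ne 0$, it computes the coefficients $c_j$ of the expansion around $z_0$, picks θ and r as in the proof (with r taken as half of the smallest admissible bound, and A taken as the sum of the $|c_j/c_k|$ for j > k), and returns a point where |p| is strictly smaller.

```python
import cmath
import math

def evaluate(coeffs, z):
    """Evaluate p(z) by Horner's rule; coeffs[j] is the coefficient of z**j."""
    result = 0j
    for a in reversed(coeffs):
        result = result * z + a
    return result

def taylor_coefficients(coeffs, z0):
    """Return [c_0, ..., c_n] with p(z) = sum_j c_j (z - z0)**j (repeated synthetic division)."""
    c = [complex(a) for a in coeffs]
    n = len(c) - 1
    for i in range(n):
        for j in range(n - 1, i - 1, -1):
            c[j] += z0 * c[j + 1]
    return c

def descent_step(coeffs, z0, tol=1e-12):
    """Move from z0 to z0 + r*exp(i*theta) as in the proof; assumes p(z0) != 0."""
    c = taylor_coefficients(coeffs, z0)
    w0 = c[0]
    k = next(j for j in range(1, len(c)) if abs(c[j]) > tol)  # first nonzero c_k
    theta = (cmath.phase(w0) - cmath.phase(c[k]) + math.pi) / k
    A = sum(abs(c[j] / c[k]) for j in range(k + 1, len(c)))   # |g(r, theta)| <= A*r on [0, 1]
    bounds = [1.0, (abs(w0) / abs(c[k])) ** (1.0 / k)]
    if A > 0:
        bounds.append(1.0 / (2.0 * A))
    r = 0.5 * min(bounds)                                     # strictly below every bound
    return z0 + r * cmath.exp(1j * theta)

coeffs = [5, -2, 0, 1]   # hypothetical example: p(z) = z^3 - 2z + 5
z = 1 + 1j
for _ in range(5):
    z_next = descent_step(coeffs, z)
    print(abs(evaluate(coeffs, z)), "->", abs(evaluate(coeffs, z_next)))
    z = z_next
```

Each printed line shows |p| decreasing, in line with the final chain of inequalities. The proof itself only needs a single such step at the minimizing point; repeating the step, as done here, is an extra experiment and comes with no convergence guarantee from the argument above.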


Second proof: topological proof. If the constant coefficient $a_0 = p(0)$ of p(z) is equal to 0,
then we are done, since 0 is a complex root of p(z). Otherwise, consider the image under
p of the circle |z| = r. Note that, on the one hand, for sufficiently small values of r, the
image is contained in a neighborhood of $a_0$, so it cannot “go around” the origin.

On the other hand, for r very large, we have

$$\begin{aligned}
p(re^{i\theta}) &= a_n r^n e^{in\theta} \Bigl( 1 + \frac{a_{n-1}}{a_n} r^{-1} e^{-i\theta} + \cdots + \frac{a_0}{a_n} r^{-n} e^{-in\theta} \Bigr) \\
&= a_n r^n e^{in\theta} \bigl( 1 + h(r,\theta) \bigr),
\end{aligned}$$

where h(r, θ) is a function that satisfies $\lim_{r\to\infty} h(r,\theta) = 0$ (uniformly in θ). As θ goes
from 0 to 2π, this is a closed curve that goes around the origin n times (in an
approximately circular path, which becomes closer and closer to a circle as r → ∞).

As we gradually increase r from 0 to a very large number, to transition from a curve
that does not go around the origin to a curve that goes around the origin n times, there
has to be a value of r for which the curve crosses 0. This means that the circle |z| = r
contains a point z such that p(z) = 0, which was the claim.
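
The phrase “goes around the origin n times” can be explored numerically by accumulating the change in argument of $p(re^{i\theta})$ as θ runs from 0 to 2π. The sketch below is a rough illustration only: the function name `winding_number`, the sample count, and the example polynomial are assumptions, and the computation is meaningful only when the image curve avoids the origin and the sampling is fine enough that consecutive samples differ in argument by less than π.

```python
import cmath
import math

def winding_number(coeffs, r, samples=4000):
    """Approximate how many times the image of |z| = r under p winds around 0.

    coeffs[j] is the coefficient of z**j.
    """
    def p(z):
        result = 0j
        for a in reversed(coeffs):
            result = result * z + a
        return result

    total_angle = 0.0
    previous = p(complex(r, 0.0))
    for m in range(1, samples + 1):
        current = p(r * cmath.exp(2j * math.pi * m / samples))
        total_angle += cmath.phase(current / previous)  # small signed angle increment
        previous = current
    return round(total_angle / (2 * math.pi))

coeffs = [5, -2, 0, 1]   # hypothetical example: p(z) = z^3 - 2z + 5
for r in (0.5, 1.8, 3.0):
    print(r, winding_number(coeffs, r))
```

For this polynomial the winding number is 0 for small r and 3 for large r, and it changes at the radii where the circle passes through a root, which is the phenomenon the topological argument exploits.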


The argument presented in the topological proof is imprecise. It can be made rigor- 
ous in a couple of ways—one way we will see a bit later is using Rouché’s theorem (see 
Section 1.13 and Exercise 1.30 at the end of the chapter). The difficulty of making these 
sorts of arguments precise, in spite of their appealing intuitive nature, gives a hint as to 
the importance of subtle topological arguments in complex analysis. 
As another remark, the topological proof should be compared to the standard calculus
proof that any odd-degree polynomial over the reals has a real root. That argument
is also “topological,” although much more elementary.


Third proof: typical textbook proof (or: “hocus-pocus” proof). This is a one-liner of a
proof that assumes some complex analysis knowledge. Recall that an entire function is
a function $f : \mathbb{C} \to \mathbb{C}$ that is everywhere holomorphic. Recall the well-known Liouville’s
theorem, which states that any bounded entire function is constant.

Assuming this result (which we will prove in Section 1.9), if p(z) is a polynomial
with no root, then 1/p(z) is an entire function. Moreover, it is bounded, since our earlier
observation (1.5) implies that $\lim_{|z|\to\infty} 1/p(z) = 0$. By Liouville’s theorem it follows that
1/p(z) is a constant, which then has to be 0, leading to a contradiction.


To summarize this section, we saw three proofs of the fundamental theorem of al- 
gebra. They are all beautiful—the “hocus-pocus” proof certainly packs a punch, which is 
why it is a favorite of complex analysis textbooks—but personally I like the first one best 
since it is fully rigorous while being completely elementary and not requiring the use 
of either Cauchy’s theorem or any of its consequences, or of subtle topological concepts. 
Moreover, it employs a “local” argument based on understanding how a polynomial
behaves locally, whereas the other two proofs can be characterized as “global.”
It is a general principle in mathematical analysis (that has analogies in other areas of 
mathematics, such as number theory and graph theory) that local arguments are con- 
ceptually easier than global ones. 


Suggested exercises for Section 1.2. 1.1, 1.2. 


1.3 Holomorphicity, conformality, and the Cauchy-Riemann equations


In this section, we begin to build the theory in a systematic way by laying its most basic 
cornerstone, the definition of holomorphicity, along with some of the useful ways to 
think about this fundamental concept. 


1.3.1 Definition of holomorphicity 


A function f(z) of a complex variable is called holomorphic at $z_0$ if the limit

$$\lim_{h \to 0} \frac{f(z_0 + h) - f(z_0)}{h} \tag{1.10}$$

exists. In this case, we denote this limit by $f'(z_0)$ and call it the derivative of f at $z_0$. A
function of a complex variable defined on all of the complex plane that is everywhere
holomorphic is called an entire function.
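
The requirement that the limit (1.10) exists as h → 0 through complex values is much stronger than it may look. A small numerical experiment makes this concrete. In the sketch below (an illustration only; the function names and the step size `h` are arbitrary choices), the difference quotient is evaluated for h approaching 0 along several directions. For f(z) = z² the values agree up to a small error, while for complex conjugation they depend on the direction, so the limit does not exist.

```python
import cmath

def difference_quotients(f, z0, h=1e-6, directions=8):
    """Difference quotients (f(z0 + h*d) - f(z0)) / (h*d) for unit vectors d."""
    quotients = []
    for m in range(directions):
        d = cmath.exp(2j * cmath.pi * m / directions)  # unit vector giving the direction of approach
        quotients.append((f(z0 + h * d) - f(z0)) / (h * d))
    return quotients

def spread(values):
    """Largest distance between any two of the given complex numbers."""
    return max(abs(a - b) for a in values for b in values)

z0 = 1 + 2j
print(spread(difference_quotients(lambda z: z * z, z0)))          # tiny: all directions agree
print(spread(difference_quotients(lambda z: z.conjugate(), z0)))  # about 2: direction matters
```

For conjugation the quotient equals $\overline{d}/d$ exactly, which ranges over the whole unit circle as the direction d varies.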

The terms analytic, differentiable, and complex-differentiable are synonyms for 
“holomorphic.” Some books will make a somewhat pedantic distinction between “ana- 
lytic” and “holomorphic” as two distinct concepts that are defined in a priori different 
ways but are then shown to be equivalent soon afterward, at which point the distinction 
ceases to have any real importance. In this book, we do not follow that approach. 

The following are basic properties of complex derivatives. 


Lemma 1.4. Under appropriate assumptions (see Exercise 1.4), we have the relations 


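$$(f + g)'(z) = f'(z) + g'(z), \tag{1.11}$$
$$(fg)'(z) = f'(z) g(z) + f(z) g'(z), \tag{1.12}$$
$$\Bigl( \frac{1}{f} \Bigr)'(z) = -\frac{f'(z)}{f(z)^2}, \tag{1.13}$$
$$\Bigl( \frac{f}{g} \Bigr)'(z) = \frac{f'(z) g(z) - f(z) g'(z)}{g(z)^2}, \tag{1.14}$$
$$(f \circ g)'(z) = f'(g(z))\, g'(z). \tag{1.15}$$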


Proof. Exercise 1.4. 


The concept of the derivative in complex analysis is clearly at the heart of the subject,
and there are several helpful ways to think about its meaning. Assume that f(z)
is holomorphic at $z_0$. In the discussion below, we make the further assumption that
$f'(z_0) \ne 0$.


1.3.2 First interpretation of holomorphicity: local geometric behavior 


If we write the polar decomposition $f'(z_0) = re^{i\theta}$ of the derivative, then for points z that
are close to $z_0$, we will have the approximate equality

$$\frac{f(z) - f(z_0)}{z - z_0} \approx f'(z_0) = re^{i\theta}$$

or, equivalently,

$$f(z) = f(z_0) + re^{i\theta}(z - z_0) + [\text{lower-order terms}],$$

where “lower-order terms” refers to a quantity that is much smaller in magnitude than
$|z - z_0|$ when z is close to $z_0$. Geometrically, this means that to compute f(z), we start
from $f(z_0)$ and move by a vector that results by taking the displacement vector $z - z_0$,
rotating it by an angle of θ, and then scaling it by a factor of r (which corresponds to a
magnification if r > 1, a shrinking if 0 < r < 1, or no scaling if r = 1). This idea can be
summarized by the slogan:

Holomorphic functions behave locally as a rotation composed with a scaling.

The local behavior of analytic functions in the case $f'(z_0) = 0$ is more subtle; see
Section 1.16.


1.3.3 Second interpretation of holomorphicity: the Cauchy-Riemann equations 


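Next, we interpret holomorphicity from the point of view of real analysis. Remembering
that complex numbers are vectors that have real and imaginary components, we can
denote z = x + iy, where x and y are the real and imaginary parts of the complex number
z, and f = u + iv, where u and v are real-valued functions of z (or, equivalently, of x and
y) that return the real and imaginary parts, respectively, of f. Now if f is holomorphic
at z, then the limit (1.10) exists as a complex limit, that is, independently of the way h
approaches 0 as a complex number. In particular, we can evaluate the limit in two ways
by considering two specific ways of letting h approach 0, as a pure real number or as a
pure imaginary number. For the first of those possibilities, we have

$$\begin{aligned}
f'(z) &= \lim_{h \to 0,\ h \in \mathbb{R}} \frac{f(z+h) - f(z)}{h} \\
&= \lim_{h \to 0,\ h \in \mathbb{R}} \Bigl( \frac{u(x + h + iy) - u(x + iy)}{h} + i\, \frac{v(x + h + iy) - v(x + iy)}{h} \Bigr) \\
&= \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x}.
\end{aligned}$$

Similarly, for the second method of approaching 0, we get that

$$\begin{aligned}
f'(z) &= \lim_{h \to 0,\ h \in i\mathbb{R}} \frac{f(z+h) - f(z)}{h} \\
&= \lim_{h \to 0,\ h \in i\mathbb{R}} \Bigl( \frac{u(x + iy + h) - u(x + iy)}{h} + i\, \frac{v(x + iy + h) - v(x + iy)}{h} \Bigr) \\
&= \lim_{h \to 0,\ h \in \mathbb{R}} \Bigl( \frac{u(x + iy + ih) - u(x + iy)}{ih} + i\, \frac{v(x + iy + ih) - v(x + iy)}{ih} \Bigr) \\
&= -i \frac{\partial u}{\partial y} - i \cdot i\, \frac{\partial v}{\partial y} = \frac{\partial v}{\partial y} - i \frac{\partial u}{\partial y}.
\end{aligned}$$

Since these limits are equal, by equating their real and imaginary parts we get a
celebrated system of partial differential equations, the Cauchy-Riemann equations:

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}. \tag{1.16}$$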
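We have proved that if f is holomorphic at z = x + iy, then the components u and v of
f satisfy the Cauchy-Riemann equations (1.16). A kind of converse to this is also true but
requires additional assumptions. Assume that f = u + iv is continuously differentiable
at z = x + iy (in the sense that each of u and v is a continuously differentiable function
of x, y as defined in ordinary real analysis) and satisfies the Cauchy-Riemann equations
there. This implies that f has a differential at z; that is, in the notation of vector calculus,
if we denote f, z, and Δz as the column vectors

$$f = \begin{pmatrix} u \\ v \end{pmatrix}, \qquad z = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad \Delta z = \begin{pmatrix} h_1 \\ h_2 \end{pmatrix},$$

then we have

$$f(z + \Delta z) = f(z) + \begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \end{pmatrix} + E(\Delta z),$$

where E(Δz) is a function of Δz that satisfies

$$\lim_{\Delta z \to 0} \frac{|E(\Delta z)|}{|\Delta z|} = 0.$$

Now by the assumption that the Cauchy-Riemann equations hold, we also have

$$\begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ \frac{\partial v}{\partial x} & \frac{\partial v}{\partial y} \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \end{pmatrix}
= \begin{pmatrix} \frac{\partial u}{\partial x} & \frac{\partial u}{\partial y} \\ -\frac{\partial u}{\partial y} & \frac{\partial u}{\partial x} \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \end{pmatrix}
= \begin{pmatrix} \frac{\partial u}{\partial x} h_1 + \frac{\partial u}{\partial y} h_2 \\ -\frac{\partial u}{\partial y} h_1 + \frac{\partial u}{\partial x} h_2 \end{pmatrix},$$

which is the vector calculus notation for the complex number

$$\Bigl( \frac{\partial u}{\partial x} - i \frac{\partial u}{\partial y} \Bigr)(h_1 + i h_2) = \Bigl( \frac{\partial u}{\partial x} - i \frac{\partial u}{\partial y} \Bigr) \Delta z.$$

So we have shown that (again, in complex analysis notation)

$$\lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z} = \lim_{\Delta z \to 0} \Bigl( \frac{\partial u}{\partial x} - i \frac{\partial u}{\partial y} + \frac{E(\Delta z)}{\Delta z} \Bigr) = \frac{\partial u}{\partial x} - i \frac{\partial u}{\partial y}.$$

This proves that f is holomorphic at z with derivative given by $f'(z) = \frac{\partial u}{\partial x} - i \frac{\partial u}{\partial y}$. We
summarize the above discussion with the following proposition.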


Proposition 1.5 (Cauchy-Riemann equations). Let f = u + iv be a function of a complex 
variable z with real and imaginary parts u and v, respectively. If f is holomorphic at z, 
then the Cauchy—Riemann equations (1.16) are satisfied at z. Conversely, if equations (1.16) 
are satisfied at z and if u and v are continuously differentiable functions at z, then f is 
holomorphic at z. 
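
The Cauchy-Riemann equations lend themselves to a simple numerical check using finite differences. The following Python sketch is meant only as an illustration: the function name `cauchy_riemann_residuals`, the step size `h`, and the test point are arbitrary choices, and central differences only approximate the partial derivatives. It returns the two quantities $u_x - v_y$ and $v_x + u_y$, both of which vanish exactly when the equations (1.16) hold.

```python
import cmath

def cauchy_riemann_residuals(f, z, h=1e-6):
    """Approximate (u_x - v_y, v_x + u_y) at z by central differences."""
    x, y = z.real, z.imag
    fx = (f(complex(x + h, y)) - f(complex(x - h, y))) / (2 * h)  # u_x + i v_x
    fy = (f(complex(x, y + h)) - f(complex(x, y - h))) / (2 * h)  # u_y + i v_y
    u_x, v_x = fx.real, fx.imag
    u_y, v_y = fy.real, fy.imag
    return u_x - v_y, v_x + u_y

z = 0.3 + 0.7j
print(cauchy_riemann_residuals(cmath.exp, z))                  # both close to 0
print(cauchy_riemann_residuals(lambda w: w.conjugate(), z))    # first residual close to 2
```

The exponential function passes the test, as expected for an entire function, while complex conjugation (u = x, v = -y) fails the first equation at every point.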
1.3.4 Third interpretation of holomorphicity: conformal maps


Going back to a more geometric way of thinking about holomorphicity, a further inter- 
pretation of the meaning of this property is that holomorphic functions are conformal 
mappings where their derivatives do not vanish. More precisely, assume as before that 
f(z) is holomorphic at zp) and f'(Zọ) + 0. Let y4, yz : (a,b) — C be two differentiable 
parameterized planar curves defined on some interval (a, b) containing 0, such that 
y,(0) = y,(0) = Zp. The tangent vectors to the curves y, and y, at Zọ are the complex 
numbers v; and v, defined by 


vı =y1(0), vz = ¥3(0). (1.17) 
Similarly, the tangent vectors to the curves f o y, and f o° y, at f (Zọ) are 
w = (f ° y1)'(0) w= (f ° y2)' (0), 


which, by a version of the chain rule from vector calculus adapted to complex-analytic 
notation (Exercise 1.6), can be rewritten as 


w = f’ (y1(0))y1(0) = f'(Zo)y;(0), (1.18) 
w3 = f'(y2(0))y3(0) = f" (Zo)y3(0). (1.19) 
It follows that we can write the inner products (in the ordinary sense of planar vector 
geometry) between the complex number pairs vy, v. and w4, W, as 
(vi; V2) = Re(v,V2), 
(w1, w2) = Re(w,W2) = Re((f’ (Zo), (0))(F" (Zo) y3(0))) 
= f' (Zf! (Zo) Re(v,V2) = MOKA v3). (1.20) 


If we denote by $\theta$ and $\varphi$ the angle between $v_1, v_2$ and the angle between $w_1, w_2$,
respectively, we then get using (1.17)-(1.20) that

$$\cos\varphi = \frac{\langle w_1, w_2 \rangle}{|w_1|\,|w_2|} = \frac{|f'(z_0)|^2 \langle v_1, v_2 \rangle}{|f'(z_0) v_1|\,|f'(z_0) v_2|} = \frac{\langle v_1, v_2 \rangle}{|v_1|\,|v_2|} = \cos\theta.$$

So we have shown that under the assumption that $f'(z_0) \neq 0$, the function $f(z)$ maps two
curves meeting at an angle $\theta$ at $z_0$ to two curves that meet at the same angle at $f(z_0)$. A
function with this property is said to be conformal at $z_0$; see Fig. 1.4.
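As a numerical sanity check of the angle computation above (not part of the formal development), the following Python sketch compares the angle between two tangent vectors before and after applying a holomorphic map. The helper names `angle` and `tangent`, the choice $f = \exp$, the base point, and the two curves are all illustrative assumptions; the tangent vectors are approximated by central differences.

```python
import cmath
import math

def angle(v, w):
    """Unsigned angle between two nonzero complex numbers viewed as planar vectors."""
    c = (v * w.conjugate()).real / (abs(v) * abs(w))  # <v, w> = Re(v * conj(w))
    return math.acos(max(-1.0, min(1.0, c)))

def tangent(curve, h=1e-5):
    """Central-difference approximation of curve'(0)."""
    return (curve(h) - curve(-h)) / (2 * h)

f = cmath.exp                                   # holomorphic, derivative never vanishes
z0 = 1 + 2j
gamma1 = lambda t: z0 + t                       # tangent vector 1 at t = 0
gamma2 = lambda t: z0 + (1 + 1j) * t + t * t    # tangent vector 1 + i at t = 0

theta = angle(tangent(gamma1), tangent(gamma2))
phi = angle(tangent(lambda t: f(gamma1(t))), tangent(lambda t: f(gamma2(t))))
print(theta, phi)  # the two angles should agree up to discretization error
```

Replacing `f` by a map such as `lambda z: z.real + 2j * z.imag` (not holomorphic) should make the two printed angles differ.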

We can also prove that, under additional assumptions, the converse to the fact that 
holomorphicity with a nonvanishing derivative implies conformality also holds, making 
holomorphicity and conformality into nearly equivalent concepts. An important addi- 
tional condition is that the conformal map needs to be orientation-preserving; this 
Figure 1.4: A conformal map $f$ preserves the angle between curves crossing at a point: $\theta = \varphi$.


condition can be seen to be necessary by considering the map $f(z) = \bar{z}$, which is
conformal but not holomorphic. Recall from vector calculus that for a differentiable
planar map $f : U \to \mathbb{R}^2$ (where $U$ is some open set in $\mathbb{R}^2$), the Jacobian matrix of $f$ is
the matrix of partial derivatives,

$$J_f = \begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix}. \tag{1.21}$$

If $\det J_f > 0$, then we say that $f$ preserves orientation.


Theorem 1.6. If $f = u + iv$ is holomorphic at $z_0$ and $f'(z_0) \neq 0$, then $f$ is conformal at
$z_0$. Conversely, if $f$ is conformal at $z_0$, continuously differentiable at $z_0$ in the real analysis
sense, and preserves orientation at $z_0$, then $f$ is holomorphic at $z_0$.


The first claim of the theorem was already proved above. The converse direction is 
proved with the help of the Cauchy—Riemann equations. First, we will need the following 
simple lemma about linear transformations in the plane. 


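Lemma 1.7 (Linear conformal maps). Assume that $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is a $2 \times 2$ real matrix. The
following are equivalent:

(a) $A$ preserves orientation (that is, $\det A > 0$) and is a linear conformal map, that is,
satisfies

$$\frac{\langle A w_1, A w_2 \rangle}{|A w_1|\,|A w_2|} = \frac{\langle w_1, w_2 \rangle}{|w_1|\,|w_2|} \qquad (w_1, w_2 \in \mathbb{R}^2 \setminus \{(0,0)\}). \tag{1.22}$$

(b) $A$ takes the form

$$A = \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \quad \text{for some } a, b \in \mathbb{R} \text{ with } a^2 + b^2 > 0. \tag{1.23}$$

(c) $A$ takes the form

$$A = r \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad \text{for some } r > 0 \text{ and } \theta \in \mathbb{R}.$$

(That is, geometrically $A$ acts by a rotation followed by a scaling.)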


Proof that (a) $\Rightarrow$ (b). Note that both columns of $A$ are nonzero vectors by the
assumption that $\det A > 0$. Now applying assumption (1.22) with $w_1 = (1,0)^\top$, $w_2 = (0,1)^\top$ yields
that $(a,c) \perp (b,d)$, so that we must have

$$(b, d) = \kappa(-c, a) \tag{1.24}$$

for some $\kappa \in \mathbb{R} \setminus \{0\}$. On the other hand, applying (1.22) with $w_1 = (1,1)^\top$ and $w_2 = (1,-1)^\top$
yields that $(a+b, c+d) \perp (a-b, c-d)$, which is easily seen to be equivalent to $a^2 + c^2 =
b^2 + d^2$. When combined with (1.24), this implies that $\kappa = \pm 1$. So $A$ is of one of the two
forms $\begin{pmatrix} a & -c \\ c & a \end{pmatrix}$ or $\begin{pmatrix} a & c \\ c & -a \end{pmatrix}$. Finally, the assumption that $\det A > 0$ means that it is the first of
those two possibilities that must occur.


Proof of the implications (b) $\Leftrightarrow$ (c) and (b) $\Rightarrow$ (a). This is left as an exercise
(Exercise 1.7).
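The lemma can be explored numerically before attempting the exercise. The Python sketch below, with hypothetical helper names `apply`, `cos_angle`, and `det` and arbitrarily chosen test vectors, checks that a matrix of the form (1.23) preserves the cosine in (1.22) and has positive determinant, whereas complex conjugation preserves the same cosine but reverses orientation.

```python
import math

def apply(A, w):
    """Apply a 2x2 matrix (tuple of rows) to a vector (pair of floats)."""
    (a, b), (c, d) = A
    return (a * w[0] + b * w[1], c * w[0] + d * w[1])

def cos_angle(w1, w2):
    """Cosine of the angle between two nonzero planar vectors."""
    dot = w1[0] * w2[0] + w1[1] * w2[1]
    return dot / (math.hypot(*w1) * math.hypot(*w2))

def det(A):
    (a, b), (c, d) = A
    return a * d - b * c

r, theta = 2.5, 0.7
a, b = r * math.cos(theta), r * math.sin(theta)
A = ((a, -b), (b, a))               # form (1.23), equivalently form (c)
conj = ((1.0, 0.0), (0.0, -1.0))    # the map z -> conj(z) as a real matrix

w1, w2 = (1.0, 2.0), (-3.0, 0.5)
before = cos_angle(w1, w2)
after_A = cos_angle(apply(A, w1), apply(A, w2))
after_conj = cos_angle(apply(conj, w1), apply(conj, w2))
print(before, after_A, after_conj, det(A), det(conj))
```

Note also that `A` acts on $(x, y)$ exactly as multiplication by the complex number $a + ib$ acts on $x + iy$, which is the link to holomorphicity used in the proof of Theorem 1.6.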


Proof of Theorem 1.6. Assume that $f$ is conformal, continuously differentiable, and
orientation-preserving at $z_0$. Let $\gamma : (a,b) \to \mathbb{C}$ be a differentiable parameterized planar
curve with $0 \in (a,b)$, $\gamma(0) = z_0$, and tangent vector $v = \gamma'(0)$ at $z_0$. By standard
properties of differentiable planar maps the tangent vector of $f \circ \gamma$ at $f(z_0)$ is $J_f(z_0)v$ (that
is, the Jacobian matrix of $f$ at $z_0$ acting as a linear map on the vector $v$, interpreted as
a column vector). This means that $f$ is conformal at $z_0$ if and only if the matrix $J_f(z_0)$ is
a linear conformal map in the sense of satisfying condition (1.22) in Lemma 1.7(a). Now
adding the knowledge that $f$ is orientation-preserving at $z_0$, the equivalence stated in
the lemma implies that $J_f(z_0)$ must be of the form given on the right-hand side of (1.23).
Comparing that form with (1.21), we see that this precisely means that $f$ satisfies the
Cauchy-Riemann equations at $z_0$. This means that the converse part of Proposition 1.5
applies, and we conclude that $f$ is holomorphic at $z_0$, as claimed.


Suggested exercises for Section 1.3. 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 1.10, 1.11. 


1.4 Additional consequences of the Cauchy-Riemann equations 


In the previous section, we saw that the Cauchy-Riemann equations can be used to
prove the near-equivalence between holomorphicity with a nonvanishing derivative
and conformality. Another curious consequence of the Cauchy-Riemann equations,
which gives an alternative geometric picture to that of conformality, is that
holomorphicity implies the orthogonality of the level curves of $u$ and of $v$. That is, if $f = u + iv$ is
holomorphic, then

$$\langle \nabla u, \nabla v \rangle = \langle (u_x, u_y), (v_x, v_y) \rangle = u_x v_x + u_y v_y = v_y v_x - v_x v_y = 0.$$

Since $\nabla u$ (resp., $\nabla v$) is orthogonal to the level curve $\{u = c\}$ (resp., the level curve $\{v = d\}$),
this proves that the level curves $\{u = c\}$ and $\{v = d\}$ meet at right angles whenever they
intersect.
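The orthogonality of the gradients can be checked numerically. The following Python sketch is only an illustration under assumed choices (the function $f(z) = z^3$, the sample point, and the finite-difference helper `grad` are not from the text): it approximates $\nabla u$ and $\nabla v$ by central differences and evaluates their inner product.

```python
def grad(g, x, y, h=1e-6):
    """Central-difference approximation of the gradient of g at (x, y)."""
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return gx, gy

f = lambda z: z ** 3
u = lambda x, y: f(complex(x, y)).real
v = lambda x, y: f(complex(x, y)).imag

x0, y0 = 0.8, -0.3
ux, uy = grad(u, x0, y0)
vx, vy = grad(v, x0, y0)
dot = ux * vx + uy * vy
print(ux, uy, vx, vy, dot)  # dot should vanish up to discretization error
```

The printed partial derivatives also exhibit the Cauchy-Riemann pattern $u_x = v_y$, $u_y = -v_x$ on which the computation rests.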

Yet another important and remarkable consequence of the Cauchy—Riemann equa- 
tions is that, at least under mild smoothness assumptions (which, as we will see later, can 
be removed) in addition to holomorphicity, u and v are harmonic functions. Assume 
that f is holomorphic at z and is twice continuously differentiable (in the real analysis 
sense) there. Then 


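$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\right) + \frac{\partial}{\partial y}\left(\frac{\partial u}{\partial y}\right) = \frac{\partial}{\partial x}\left(\frac{\partial v}{\partial y}\right) - \frac{\partial}{\partial y}\left(\frac{\partial v}{\partial x}\right) = \frac{\partial^2 v}{\partial x\,\partial y} - \frac{\partial^2 v}{\partial y\,\partial x} = 0,$$

i.e., $u$ satisfies Laplace's equation

$$\Delta u = 0,$$

where $\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$ is the two-dimensional Laplacian operator. A function that satisfies
this equation is called a harmonic function. Similarly (check), $v$ also satisfies

$$\Delta v = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0.$$
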
So we have shown that u and v are harmonic functions. This fact is an important con- 
nection between complex analysis, real analysis, and the theory of partial differential 
equations. 
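A quick numerical illustration of harmonicity, offered as a sketch under assumed choices: the five-point stencil `laplacian`, the example $f(z) = e^z$, and the sample point are not from the text. The real and imaginary parts of $e^z$ should have vanishing discrete Laplacian, in contrast with the non-harmonic function $|z|^2$, whose Laplacian is 4.

```python
import math

def laplacian(g, x, y, h=1e-3):
    """Five-point finite-difference approximation of the Laplacian of g at (x, y)."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h) - 4 * g(x, y)) / (h * h)

u = lambda x, y: math.exp(x) * math.cos(y)   # Re(e^z)
v = lambda x, y: math.exp(x) * math.sin(y)   # Im(e^z)
q = lambda x, y: x * x + y * y               # |z|^2, not the real part of a holomorphic function

lap_u = laplacian(u, 0.3, 1.1)
lap_v = laplacian(v, 0.3, 1.1)
lap_q = laplacian(q, 0.3, 1.1)
print(lap_u, lap_v, lap_q)
```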

We will later see that the assumption of f being twice continuously differentiable is 
unnecessary, but proving this requires more advanced ideas (see Theorem 1.30 in Sec- 
tion 1.9). 

A final remark related to holomorphicity and the Cauchy-Riemann equations is the
observation that if $f = u + iv$ is holomorphic, then the determinant of its Jacobian matrix is given by

$$\det J_f = \det \begin{pmatrix} u_x & u_y \\ v_x & v_y \end{pmatrix} = u_x v_y - u_y v_x = u_x^2 + v_x^2 = |u_x + i v_x|^2 = |f'(z)|^2. \tag{1.25}$$


This can also be understood geometrically—spend a moment thinking what the geomet- 
ric interpretation is. 
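For readers who like to see (1.25) confirmed numerically, here is a small Python sketch. The helper `jacobian_det`, the test function $\sin z$, and the sample point are illustrative assumptions; the partial derivatives are approximated by central differences in the $x$ and $y$ directions.

```python
import cmath

def jacobian_det(f, z, h=1e-6):
    """Approximate u_x v_y - u_y v_x for f = u + iv at the point z."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # u_x + i v_x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # u_y + i v_y
    return fx.real * fy.imag - fy.real * fx.imag

z = 0.4 + 0.9j
det_J = jacobian_det(cmath.sin, z)
expected = abs(cmath.cos(z)) ** 2   # |f'(z)|^2 with f = sin, f' = cos
print(det_J, expected)
```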


Suggested exercises for Section 1.4. 1.12. 
1.5 Power series


Until now we have not discussed any specific examples of functions of a complex vari- 
able. Of course, there are the standard functions that you probably encountered already 
in your undergraduate studies: polynomials, rational functions, $e^z$, the trigonometric
functions, etc. Aside from these examples, it would be useful to have a general way 
to construct a large family of functions. Of course, there is such a way: power series, 
which—nonobviously—turn out to be essentially as general a family of functions as one 
could hope for. 

To make things precise, a power series is a function of a complex variable $z$ defined
by

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n, \tag{1.26}$$

where $z_0 \in \mathbb{C}$, and $(a_n)_{n=0}^{\infty}$ is a sequence of complex numbers. This function is defined
wherever the respective series converges.

For which values of $z$ does this formula make sense? Define the number $R \in [0, \infty]$
as

$$R = \left( \limsup_{n \to \infty} |a_n|^{1/n} \right)^{-1},$$


which we refer to as the radius of convergence of the power series. Its significance is 
explained in the following simple result. 


Lemma 1.8. 1. The series (1.26) converges absolutely if $|z - z_0| < R$.
2. The series (1.26) diverges for all $z$ satisfying $|z - z_0| > R$.


Proof. We assume that $0 < R < \infty$; the edge cases $R = 0$ and $R = \infty$ are left as an
exercise (Exercise 1.13). We also take $z_0 = 0$ to lighten the notation; the general case
follows by replacing $z$ with $z - z_0$. The defining property of $R$ is that for all $\epsilon > 0$, we have that
$|a_n| < (\frac{1}{R} + \epsilon)^n$ if $n$ is large enough, and $R$ is the maximal number with that property. Let
$z \in D_R(0)$. Since $|z| < R$, we have $|z|(\frac{1}{R} + \epsilon) < 1$ for some fixed $\epsilon > 0$ chosen small enough.
This implies that, for some large enough $N$ that depends on $\epsilon$,

$$\sum_{n=N}^{\infty} |a_n z^n| \le \sum_{n=N}^{\infty} \left[ \left( \frac{1}{R} + \epsilon \right) |z| \right]^n,$$

so the series is dominated by a convergent geometric series and hence converges.
Conversely, if $|z| > R$, then $|z|(\frac{1}{R} - \epsilon) > 1$ for some small enough fixed $\epsilon > 0$. Taking a
subsequence $(a_{n_k})_{k=1}^{\infty}$ for which $|a_{n_k}| > (\frac{1}{R} - \epsilon)^{n_k}$ for all $k$ (such a subsequence exists by
the definition of $R$), we see that

$$|a_{n_k} z^{n_k}| > \left[ |z| \left( \frac{1}{R} - \epsilon \right) \right]^{n_k} > 1,$$

that is, the power series (1.26) contains infinitely many terms with modulus $> 1$ and
hence diverges.
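The dichotomy in the lemma is easy to observe experimentally. The Python sketch below uses the illustrative coefficients $a_n = 2^n + n$ (so that $R = 1/2$) and a hypothetical helper `root_test_radius` that replaces the lim sup by a maximum over a finite window of large indices; this is only a crude proxy and an assumption of the sketch, not a general-purpose method.

```python
def root_test_radius(coeff, n_max=400, tail=50):
    """Crude estimate of R = 1 / limsup |a_n|^(1/n) from a finite window of indices."""
    roots = [abs(coeff(n)) ** (1.0 / n) for n in range(n_max - tail, n_max + 1)]
    return 1.0 / max(roots)

coeff = lambda n: 2.0 ** n + n   # illustrative coefficients with R = 1/2
R = root_test_radius(coeff)

def term(z, n):
    """Modulus of the n-th term a_n z^n (with z_0 = 0)."""
    return abs(coeff(n) * z ** n)

print(R)               # close to 0.5
print(term(0.4, 200))  # tiny: inside the disc of convergence
print(term(0.6, 200))  # huge: outside the disc, the terms do not even tend to 0
```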


Another important property of power series is given in the following theorem. 


Theorem 1.9 (Power series are holomorphic). Power series are holomorphic functions in 
the interior of their disc of convergence and can be differentiated termwise there; that is, 
the derivative of the infinite series is equal to the series of the derivatives. 


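Proof. Denote

$$f(z) = \sum_{n=0}^{\infty} a_n z^n = S_N(z) + R_N(z), \quad \text{where}$$

$$S_N(z) = \sum_{n=0}^{N} a_n z^n, \qquad R_N(z) = \sum_{n=N+1}^{\infty} a_n z^n,$$

and let

$$g(z) = \sum_{n=1}^{\infty} n a_n z^{n-1}.$$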


The claim is that $f$ is differentiable on the disc of convergence and that its derivative is
the power series $g$. Since $n^{1/n} \to 1$ as $n \to \infty$, it is easy to see that $f(z)$ and $g(z)$ have the
same radius of convergence. Fix $z_0$ with $|z_0| < r < R$. We wish to show that $\frac{f(z_0+h) - f(z_0)}{h}$
converges to $g(z_0)$ as $h \to 0$. Observe that

$$\frac{f(z_0+h) - f(z_0)}{h} - g(z_0) = \left( \frac{S_N(z_0+h) - S_N(z_0)}{h} - S_N'(z_0) \right) + \frac{R_N(z_0+h) - R_N(z_0)}{h} + \big( S_N'(z_0) - g(z_0) \big). \tag{1.27}$$


In this last expression, the first term converges to 0 as $h \to 0$ for any fixed $N$. To bound
the second term, fix some $\epsilon > 0$, and assume that $|h| < r$, and moreover that $|h|$ is small
enough so that $|z_0 + h| < r$. Now make use of the algebraic identity

$$p^n - q^n = (p - q)(p^{n-1} + p^{n-2} q + \cdots + p q^{n-2} + q^{n-1})$$

to get that

$$\left| \frac{R_N(z_0+h) - R_N(z_0)}{h} \right| \le \sum_{n=N+1}^{\infty} |a_n| \left| \frac{(z_0+h)^n - z_0^n}{h} \right| = \sum_{n=N+1}^{\infty} |a_n| \left| \sum_{k=0}^{n-1} (z_0+h)^k z_0^{n-1-k} \right| \le \sum_{n=N+1}^{\infty} |a_n|\, n r^{n-1}.$$

The last expression in this chain of inequalities is the tail of an absolutely convergent
series, so it can be made $< \epsilon$ by taking $N$ large enough (before taking the limit as $h \to 0$).

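Third, we have the limit $S_N'(z_0) \to g(z_0)$ as $N \to \infty$, so we can choose $N$ large
enough so that $|S_N'(z_0) - g(z_0)| < \epsilon$. Having thus chosen $N$, we get finally from (1.27) and
the above estimates that

$$\limsup_{h \to 0} \left| \frac{f(z_0+h) - f(z_0)}{h} - g(z_0) \right| \le 0 + \epsilon + \epsilon = 2\epsilon.$$

Since $\epsilon$ was an arbitrary positive number, this shows that $\frac{f(z_0+h) - f(z_0)}{h} \to g(z_0)$ as $h \to 0$,
as claimed.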
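Termwise differentiation can be illustrated with the geometric series, for which both $f(z) = 1/(1-z)$ and $g(z) = 1/(1-z)^2$ are known in closed form. In the Python sketch below, the truncation level `N = 200`, the point `z0`, and the step `h` are arbitrary illustrative choices, and `S` and `S_prime` are hypothetical names for the partial sums $S_N$ and $S_N'$ from the proof (with all $a_n = 1$).

```python
def S(z, N=200):
    """Partial sum S_N(z) of the geometric series sum z^n."""
    return sum(z ** n for n in range(N + 1))

def S_prime(z, N=200):
    """Termwise derivative S_N'(z) = sum n z^(n-1)."""
    return sum(n * z ** (n - 1) for n in range(1, N + 1))

z0 = 0.3 + 0.4j               # |z0| = 0.5 < R = 1
h = 1e-6
diff_quot = (S(z0 + h) - S(z0)) / h
g_exact = 1 / (1 - z0) ** 2   # derivative of 1/(1 - z)

print(diff_quot, S_prime(z0), g_exact)
```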


Corollary 1.10. Holomorphic functions defined as power series are differentiable (in the 
complex-analytic sense) infinitely many times in the disc of convergence. 


Corollary 1.11. For a power series $g(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n$ with positive radius of
convergence, we have

$$a_n = \frac{g^{(n)}(z_0)}{n!}. \tag{1.28}$$

In other words, $g(z)$ satisfies Taylor's formula

$$g(z) = \sum_{n=0}^{\infty} \frac{g^{(n)}(z_0)}{n!} (z - z_0)^n.$$


Suggested exercises for Section 1.5. 1.13, 1.14, 1.15, 1.16. 


1.6 Contour integrals 


We now introduce contour integrals, which are another fundamental building block 
of the theory. 

Contour integrals, like many other types of integrals, take as input a function to be 
integrated and a “thing” (or “place”) over which the function is integrated. In the case of 
contour integrals, the “thing” is a contour, which is (for our current purposes at least) a 
kind of planar curve. We start by developing some terminology to discuss such objects. 
A parameterized curve is a continuous function $\gamma : [a,b] \to \mathbb{C}$. The value $\gamma(a)$ is called
the starting point, and $\gamma(b)$ is called the ending point (the two together are referred
to as the endpoints). Two curves $\gamma_1 : [a,b] \to \mathbb{C}$, $\gamma_2 : [c,d] \to \mathbb{C}$ are called equivalent,
denoted $\gamma_1 \sim \gamma_2$, if $\gamma_2(t) = \gamma_1(I(t))$ where $I : [c,d] \to [a,b]$ is a continuous, one-to-one,
onto, increasing function. A curve $\gamma$ is called simple if it does not intersect itself, that is,
if $\gamma$ is injective. It is called closed if $\gamma(a) = \gamma(b)$.

What we will refer to as a curve is, formally speaking, an equivalence class of
parameterized curves with respect to the equivalence relation defined above. We also use
the word contour as a synonym for curve.

In practice, we will usually refer to parameterized curves simply as “curves,” which
is the usual abuse of terminology that one sees in various places in mathematics, in
which one blurs the distinction between equivalence classes and their members,
remembering that various definitions, notation, and proof arguments need to “respect
the equivalence” in the sense that they do not depend on the choice of a member. (As a
meta exercise, try to think of other examples of this phenomenon you might have
encountered in your studies.)

For our present context of developing the theory of complex analysis, we will as- 
sume that all our curves are piecewise continuously differentiable. More generally, we 
can assume them to be rectifiable, but we will not bother to develop that theory. There 
are yet more general contexts in which allowing curves to be merely continuous is ben- 
eficial (and indeed some of the ideas we will develop in a complex-analytic context can 
be carried over to that more general setting), but we will not pursue such distractions 
either. 

You probably encountered curves and parameterized curves in your earlier studies 
of multivariate calculus, where they were used to define the notion of line integrals 
of vector and scalar fields. Recall that there are two types of line integrals, which are 
referred to as line integrals of the first and second kind. The line integral of the first 
kind of a scalar (usually real-valued) function $u(z)$ over a curve $\gamma$ is defined as

$$\int_\gamma u(z)\, ds = \lim \sum_{j=1}^{n} u(z_j)\, \Delta s_j, \tag{1.29}$$

where the limit is a limit of Riemann sums with respect to a family of tagged partitions of
the interval $[a,b]$ over which the curve $\gamma$ is defined, as the norm of the partitions shrinks
to 0. Such a partition consists of partition points

$$a = t_0 < t_1 < \cdots < t_n = b,$$

and each partition subinterval $[t_{j-1}, t_j]$ is “tagged” or marked with an arbitrary point $\tau_j$
chosen from the subinterval. Given this partition, we denote $z_j = \gamma(\tau_j)$, and the symbols
$\Delta s_j$ refer to finite line elements, namely $\Delta s_j = |z_j - z_{j-1}|$. This notation gives meaning to
the right-hand side of (1.29).

The line integral of the second kind is defined for a vector field $F = (P, Q)$ (using
the more traditional notation from calculus; in the complex analysis context, we would
regard this object as the complex-valued function $F = P + iQ$) by

$$\int_\gamma F \cdot ds = \int_\gamma P\, dx + Q\, dy = \lim \sum_{j=1}^{n} \big( P(z_j)\, \Delta x_j + Q(z_j)\, \Delta y_j \big),$$

where the numbers $z_j$ are associated with the tagged partition as above, and

$$x_j = \operatorname{Re}(z_j), \quad y_j = \operatorname{Im}(z_j), \quad \Delta x_j = x_j - x_{j-1}, \quad \Delta y_j = y_j - y_{j-1}.$$


It is well known from calculus that line integrals can be expressed in terms of ordi- 
nary (single-variable) Riemann integrals. Take a couple of minutes to remind yourself 
of why the following formulas are true (assuming that all the functions involved are 
piecewise continuously differentiable): 


$$\int_\gamma u(z)\, ds = \int_a^b u(\gamma(t))\, |\gamma'(t)|\, dt, \tag{1.30}$$

$$\int_\gamma F \cdot ds = \int_a^b F(\gamma(t)) \cdot \gamma'(t)\, dt. \tag{1.31}$$

(In (1.31), “$\cdot$” refers to the dot product of vectors in the plane.)
As a further reminder, the basic result known as the fundamental theorem of
calculus for line integrals states that if $F = \nabla u$, then

$$\int_\gamma F \cdot ds = u(\gamma(b)) - u(\gamma(a)).$$

We are now ready to define contour integrals and arc length integrals, which are
the complex-analytic analogues of line integrals of the second and first kinds, respectively
(and are defined in terms of those integrals). For a function $f = u + iv$ of a complex variable $z$ and
a curve $\gamma$, the contour integral $\int_\gamma f(z)\, dz$ (in words: the integral of $f$ over the curve $\gamma$) is
defined, loosely speaking, as the line integral of the second kind “$\int_\gamma (u + iv)(dx + i\, dy)$”.
More precisely, expanding this product of a complex number and a complex differential
and separating into real and imaginary components, this definition becomes

$$\int_\gamma f(z)\, dz = \left( \int_\gamma u\, dx - \int_\gamma v\, dy \right) + i \left( \int_\gamma v\, dx + \int_\gamma u\, dy \right), \tag{1.32}$$

that is, the complex number whose real part is the line integral of $F \cdot ds$ and whose
imaginary part is the line integral of $G \cdot ds$, where $F$ and $G$ are the vector fields $F = (u, -v)$
and $G = (v, u)$. Appealing to (1.31), you can check easily that the contour integral can be
evaluated explicitly as the ordinary Riemann integral

$$\int_\gamma f(z)\, dz = \int_a^b f(\gamma(t))\, \gamma'(t)\, dt. \tag{1.33}$$


Similarly, the arc length integral is defined as

$$\int_\gamma f(z)\, |dz| = \int_\gamma f(z)\, ds = \int_\gamma u\, ds + i \int_\gamma v\, ds, \tag{1.34}$$

which is simply a line integral of the first kind in which the integrand is complex-valued.
If $\gamma$ is a closed curve, then we denote the contour integral as $\oint_\gamma f(z)\, dz$, and similarly
$\oint_\gamma f(z)\, |dz|$ for the arc length integral.
A particular case of an arc length integral is the length of the curve, denoted $\operatorname{len}(\gamma)$
and defined as the integral of the constant function 1:

$$\operatorname{len}(\gamma) = \int_\gamma |dz| = \int_a^b |\gamma'(t)|\, dt.$$
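Formula (1.33) and the formula for $\operatorname{len}(\gamma)$ translate directly into a numerical recipe. The following Python sketch uses a midpoint Riemann sum; the function names `contour_integral` and `length`, the number of subintervals, and the unit-circle example are assumptions made for illustration, and the caller is assumed to supply both $\gamma$ and $\gamma'$.

```python
import cmath
import math

def contour_integral(f, gamma, dgamma, a, b, n=20000):
    """Midpoint-rule approximation of the right-hand side of (1.33)."""
    h = (b - a) / n
    mids = (a + (j + 0.5) * h for j in range(n))
    return h * sum(f(gamma(t)) * dgamma(t) for t in mids)

def length(dgamma, a, b, n=20000):
    """Midpoint-rule approximation of len(gamma), the integral of |gamma'(t)|."""
    h = (b - a) / n
    return h * sum(abs(dgamma(a + (j + 0.5) * h)) for j in range(n))

circle = lambda t: cmath.exp(1j * t)         # unit circle, t in [0, 2*pi]
dcircle = lambda t: 1j * cmath.exp(1j * t)

I_inv = contour_integral(lambda z: 1 / z, circle, dcircle, 0.0, 2 * math.pi)
I_sq = contour_integral(lambda z: z ** 2, circle, dcircle, 0.0, 2 * math.pi)
L = length(dcircle, 0.0, 2 * math.pi)
print(I_inv, I_sq, L)
```

The first integral comes out as $2\pi i$ rather than 0, a first hint that closed-contour integrals detect more than the curve alone.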


As mentioned above, our convention of mildly abusing terminology puts on us the
burden of having to remember to check that these definitions do not depend on the
parameterization of the curve. Indeed, if $\gamma_1 \sim \gamma_2$ are representatives of the same
equivalence class of parameterized curves, that is, $\gamma_2(t) = \gamma_1(I(t))$ for some nicely behaved
function $I$, then using a standard change of variables in single-variable integrals, we see
that

$$\int_{\gamma_2} f(z)\, dz = \int_c^d f(\gamma_2(t))\, \gamma_2'(t)\, dt = \int_c^d f(\gamma_1(I(t)))\, \gamma_1'(I(t))\, I'(t)\, dt = \int_a^b f(\gamma_1(\tau))\, \gamma_1'(\tau)\, d\tau = \int_{\gamma_1} f(z)\, dz. \tag{1.35}$$



The analogous verification in the case of arc length integrals is left as an exercise 
(Exercise 1.17). 

Contour integrals have many surprising properties, but the ones on the following 
list of basic properties are not of the surprising kind. 


Proposition 1.12 (properties of contour integrals). Contour integrals satisfy the following
properties:

(a) Linearity as an operator on functions: for functions $f(z)$, $g(z)$ and complex numbers
$\alpha, \beta$, we have

$$\int_\gamma (\alpha f(z) + \beta g(z))\, dz = \alpha \int_\gamma f(z)\, dz + \beta \int_\gamma g(z)\, dz.$$

(b) Linearity as an operator on curves: if a contour $\Gamma$ is a “composition” of two contours
$\gamma_1$ and $\gamma_2$ (in a sense that is easy to define graphically but tedious to write down
precisely), then

$$\int_\Gamma f(z)\, dz = \int_{\gamma_1} f(z)\, dz + \int_{\gamma_2} f(z)\, dz.$$

Similarly, if $\gamma_2$ is the “reverse” contour of $\gamma_1$, then

$$\int_{\gamma_2} f(z)\, dz = -\int_{\gamma_1} f(z)\, dz.$$

(c) Triangle inequality:

$$\left| \int_\gamma f(z)\, dz \right| \le \int_\gamma |f(z)|\, |dz| \le \operatorname{len}(\gamma) \cdot \sup_{z \in \gamma} |f(z)|.$$


Proof. Exercise 1.18. 


Contour integrals have their own version of the fundamental theorem of calculus. 


Theorem 1.13 (The fundamental theorem of calculus for contour integrals). If $\gamma$ is a curve
connecting two points $w_1$ and $w_2$ in a region $\Omega$ on which a function $F$ is holomorphic, then

$$\int_\gamma F'(z)\, dz = F(w_2) - F(w_1).$$

Equivalently, the theorem says that to compute a general contour integral $\int_\gamma f(z)\, dz$,
we try to find a primitive of $f$, that is, a holomorphic function $F$ such that $F'(z) = f(z)$
on all of $\Omega$. (A term synonymous with “primitive” is antiderivative.) If we found such a
primitive, then the contour integral $\int_\gamma f(z)\, dz$ is given by $F(w_2) - F(w_1)$.


Proof. For smooth curves, an easy application of the chain rule gives

$$\int_\gamma F'(z)\, dz = \int_a^b F'(\gamma(t))\, \gamma'(t)\, dt = \int_a^b (F \circ \gamma)'(t)\, dt = (F \circ \gamma)(t) \Big|_{t=a}^{t=b} = F(\gamma(b)) - F(\gamma(a)) = F(w_2) - F(w_1).$$


For piecewise smooth curves, this is a trivial extension that is left to the reader. 
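To see the theorem in action, the following Python sketch integrates $f(z) = z^2$ from $0$ to $1 + i$ along two different paths and compares the results with $F(w_2) - F(w_1)$ for the primitive $F(z) = z^3/3$. The integrator `contour_integral` (a midpoint rule for (1.33)) and the two paths are illustrative assumptions.

```python
def contour_integral(f, gamma, dgamma, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f(gamma(t)) * gamma'(t) over [a, b]."""
    h = (b - a) / n
    mids = (a + (j + 0.5) * h for j in range(n))
    return h * sum(f(gamma(t)) * dgamma(t) for t in mids)

f = lambda z: z ** 2
F = lambda z: z ** 3 / 3          # a primitive of f on all of C
w1, w2 = 0j, 1 + 1j

# Two curves from w1 to w2: a parabolic arc and a straight line segment.
I_parabola = contour_integral(f, lambda t: t + 1j * t * t, lambda t: 1 + 2j * t, 0.0, 1.0)
I_segment = contour_integral(f, lambda t: (1 + 1j) * t, lambda t: 1 + 1j, 0.0, 1.0)
exact = F(w2) - F(w1)             # equals (-2 + 2i)/3

print(I_parabola, I_segment, exact)
```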


Many of our discussions of contour integrals will involve the behavior of integrals 
over closed contours and the interplay between the properties of such integrals and 
integrals over general contours. As an example of this interplay, the above result has an 
easy—but important—consequence for integrals over closed contours. 


Corollary 1.14. If $f = F'$ where $F$ is holomorphic on a region $\Omega$ (that is, $f$ has a
primitive), then for any closed contour $\gamma$ in $\Omega$, we have

$$\oint_\gamma f(z)\, dz = 0.$$

This last result has the following partial converse.


Proposition 1.15. If $f : \Omega \to \mathbb{C}$ is a continuous function on a region $\Omega$ such that

$$\oint_\gamma f(z)\, dz = 0$$

for any closed contour $\gamma$ in $\Omega$, then $f$ has a primitive.


Proof. Fix some $z_0 \in \Omega$. For any $z \in \Omega$, there is some curve $\gamma(z_0, z)$ connecting $z_0$ and
$z$ (since $\Omega$ is connected and open, hence pathwise-connected; this is a standard exercise in
topology). Moreover it is also not hard to see that the curve can be assumed to be
piecewise differentiable. Define

$$F(z) = \int_{\gamma(z_0, z)} f(w)\, dw. \tag{1.36}$$

By the assumption this integral does not depend on which curve $\gamma(z_0, z)$ connecting $z_0$
and $z$ was chosen, so $F(z)$ is well-defined. We now claim that $F$ is holomorphic and its
derivative is equal to $f$. To see this, note that if $h$ is a complex number such that $z + h \in \Omega$,
then

$$\frac{F(z+h) - F(z)}{h} - f(z) = \frac{1}{h} \left( \int_{\gamma(z_0, z+h)} f(w)\, dw - \int_{\gamma(z_0, z)} f(w)\, dw \right) - f(z)$$
$$= \frac{1}{h} \int_{\gamma(z, z+h)} f(w)\, dw - f(z) = \frac{1}{h} \int_{\gamma(z, z+h)} (f(w) - f(z))\, dw, \tag{1.37}$$

where $\gamma(z, z+h)$ denotes a curve in $\Omega$ connecting $z$ and $z + h$. When $|h|$ is sufficiently
small so that the disc $D_{|h|}(z)$ is contained in $\Omega$, we can take $\gamma(z, z+h)$ as the straight line
segment connecting $z$ and $z + h$. For such $h$, we get that

$$\left| \frac{F(z+h) - F(z)}{h} - f(z) \right| \le \frac{1}{|h|} \operatorname{len}(\gamma(z, z+h)) \sup_{w \in D_{|h|}(z)} |f(w) - f(z)| = \sup_{w \in D_{|h|}(z)} |f(w) - f(z)| \xrightarrow[h \to 0]{} 0$$

by the continuity of $f$.


Lemma 1.16. If $f$ is holomorphic on $\Omega$ and $f' \equiv 0$, then $f$ is a constant.


Proof. Fix some $z_0 \in \Omega$. For any $z \in \Omega$, as we discussed above, there is a path $\gamma(z_0, z)$
connecting $z_0$ and $z$. Then

$$f(z) - f(z_0) = \int_{\gamma(z_0, z)} f'(w)\, dw = 0,$$

and hence $f(z) = f(z_0)$, that is, $f$ is constant.


Suggested exercises for Section 1.6. 1.17, 1.18. 


1.7 The Cauchy, Goursat, and Morera theorems 


One of the central results in complex analysis is Cauchy’s theorem. 


Theorem 1.17 (Cauchy’s theorem). If $f$ is a holomorphic function on a simply connected
region $\Omega$, then for any closed curve $\gamma$ in $\Omega$, we have

$$\oint_\gamma f(z)\, dz = 0.$$


The challenges facing us are as follows: first, to prove Cauchy’s theorem for curves 
and regions that are relatively simple (where we do not have to deal with subtle topolog- 
ical considerations); second, to define what “simply connected” means; third, to extend 
the theorem to the most general setting. This is done in the next section. 

Two other theorems closely related to Cauchy’s theorem are Goursat’s theorem, a 
relatively easy particular case of Cauchy’s theorem, and Morera’s theorem, which is a 
kind of converse to Cauchy’s theorem. 


Theorem 1.18 (Goursat’s theorem). If $f$ is holomorphic on a region $\Omega$, $T$ is a triangle
contained in $\Omega$, and $\partial T$ is the boundary of $T$ (considered as a curve in the usual sense), then

$$\oint_{\partial T} f(z)\, dz = 0. \tag{1.38}$$


Theorem 1.19 (Morera’s theorem). If $f : \Omega \to \mathbb{C}$ is a continuous function on a region $\Omega$
such that

$$\oint_\gamma f(z)\, dz = 0$$

for any closed contour $\gamma$ in $\Omega$, then $f$ is holomorphic on $\Omega$.

Morera’s theorem is proved in Section 1.9.


Proof of Goursat’s theorem. The proof can be summarized with a slogan “localize the 
damage.” Namely, try to translate a global statement about the integral around the tri- 
angle to a local statement about behavior near a specific point inside the triangle, which 
would become manageable since we have a good understanding of the local behavior of 
a holomorphic function near a point. If something goes wrong with the global integral,
then something has to go wrong at the local level, and we will show that cannot happen. 
(Although technically the proof is not a proof by contradiction, conceptually I find this 
a helpful way to think about it). 

The idea can be made more precise using triangle subdivision. Specifically, let $T^{(0)} = T$,
and define a hierarchy of subdivided triangles:

order 0 triangle: $T^{(0)}$,
order 1 triangles: $T^{(1)}_{j}$, $1 \le j \le 4$,
order 2 triangles: $T^{(2)}_{j,k}$, $1 \le j, k \le 4$,
order 3 triangles: $T^{(3)}_{j,k,\ell}$, $1 \le j, k, \ell \le 4$,
...
order $n$ triangles: $T^{(n)}_{j_1, \ldots, j_n}$, $1 \le j_1, \ldots, j_n \le 4$.

Here the triangles $T^{(n)}_{j_1, \ldots, j_n}$ for $j_n = 1, 2, 3, 4$ are obtained by subdividing the order-$(n-1)$
triangle $T^{(n-1)}_{j_1, \ldots, j_{n-1}}$ into 4 subtriangles whose vertices are the vertices and/or edge bisectors
of $T^{(n-1)}_{j_1, \ldots, j_{n-1}}$; see Fig. 1.5.


Figure 1.5: The triangle $T = T^{(0)}$ and the first few steps in its hierarchy of subdivided triangles.
Now, given the way this subdivision was done, it is clear that we have the relation

$$\oint_{\partial T^{(n-1)}_{j_1, \ldots, j_{n-1}}} f(z)\, dz = \sum_{j_n = 1}^{4} \oint_{\partial T^{(n)}_{j_1, \ldots, j_n}} f(z)\, dz$$

(where $\partial T^{(n)}_{j_1, \ldots, j_n}$ refers as before to the boundary of the triangle $T^{(n)}_{j_1, \ldots, j_n}$, considered as a
curve oriented in the positive sense) due to cancelation along the internal edges, and
hence

$$\oint_{\partial T^{(0)}} f(z)\, dz = \sum_{j_1, \ldots, j_n = 1}^{4} \oint_{\partial T^{(n)}_{j_1, \ldots, j_n}} f(z)\, dz.$$

So the contour integral around the boundary of the original triangle is equal to the sum
of the integrals around all $4^n$ triangles at the $n$th subdivision level. Now a key
observation is that one of these integrals has to have a modulus that is at least as big as the
average, that is, there exists an $n$-tuple $j(n) = (j_1^{(n)}, \ldots, j_n^{(n)}) \in \{1, 2, 3, 4\}^n$ for which

$$\left| \oint_{\partial T^{(0)}} f(z)\, dz \right| \le 4^n \left| \oint_{\partial T^{(n)}_{j(n)}} f(z)\, dz \right|. \tag{1.39}$$

Moreover, we can choose $j(n)$ inductively in such a way that the triangles $T^{(n)}_{j(n)}$ are nested,
that is, $T^{(n)}_{j(n)} \subset T^{(n-1)}_{j(n-1)}$ for $n \ge 1$, or, equivalently, $j(n) = (j_1^{(n-1)}, \ldots, j_{n-1}^{(n-1)}, k)$ for some
$1 \le k \le 4$. To make this happen, choose a value of $k$ for which $\left| \oint_{\partial T^{(n)}_{(j(n-1), k)}} f(z)\, dz \right|$ is
greater than or equal to the average

$$\frac{1}{4} \sum_{d=1}^{4} \left| \oint_{\partial T^{(n)}_{(j(n-1), d)}} f(z)\, dz \right|,$$

which in turn can be seen (by induction) to be greater than or equal to

$$\frac{1}{4} \left| \sum_{d=1}^{4} \oint_{\partial T^{(n)}_{(j(n-1), d)}} f(z)\, dz \right| = \frac{1}{4} \left| \oint_{\partial T^{(n-1)}_{j(n-1)}} f(z)\, dz \right| \ge 4^{-n} \left| \oint_{\partial T^{(0)}} f(z)\, dz \right|,$$

thereby justifying (1.39).
We now claim that the sequence of nested triangles $T^{(n)}_{j(n)}$ shrinks to a single point,
that is, we have

$$\bigcap_{n=0}^{\infty} T^{(n)}_{j(n)} = \{ z_0 \}$$

for some point $z_0 \in T$. Indeed, the diameter of the triangles goes to 0 as $n \to \infty$, so
certainly there cannot be two distinct points in the intersection. On the other hand, the
triangles $T^{(n)}_{j(n)}$ are all compact, and the finite intersections $\bigcap_{n=0}^{N} T^{(n)}_{j(n)}$ are nonempty, so by
the standard finite intersection property of compact sets the full intersection $\bigcap_{n=0}^{\infty} T^{(n)}_{j(n)}$
is also nonempty.


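Having defined $z_0$, write $f(z)$ for $z$ near $z_0$ as

$$f(z) = f(z_0) + f'(z_0)(z - z_0) + \psi(z)(z - z_0),$$

where

$$\psi(z) = \frac{f(z) - f(z_0)}{z - z_0} - f'(z_0).$$

The holomorphicity of $f$ at $z_0$ implies that $\psi(z) \to 0$ as $z \to z_0$. Denote by $d^{(n)}$ the
diameter of $T^{(n)}_{j(n)}$ and by $p^{(n)}$ its perimeter. Each subdivision shrinks both the diameter and
perimeter by a factor of 2, so we have

$$d^{(n)} = 2^{-n} d^{(0)}, \qquad p^{(n)} = 2^{-n} p^{(0)}.$$

The affine function $z \mapsto f(z_0) + f'(z_0)(z - z_0)$ has a primitive, so by Corollary 1.14 its
integral around any closed contour vanishes. It follows that

$$\left| \oint_{\partial T^{(n)}_{j(n)}} f(z)\, dz \right| = \left| \oint_{\partial T^{(n)}_{j(n)}} \big( f(z_0) + f'(z_0)(z - z_0) + \psi(z)(z - z_0) \big)\, dz \right|$$
$$= \left| \oint_{\partial T^{(n)}_{j(n)}} \psi(z)(z - z_0)\, dz \right| \le p^{(n)} d^{(n)} \sup_{z \in T^{(n)}_{j(n)}} |\psi(z)| = 4^{-n} p^{(0)} d^{(0)} \sup_{z \in T^{(n)}_{j(n)}} |\psi(z)|.$$

This estimate allows us to finish, since combining it with (1.39), we get that

$$\left| \oint_{\partial T^{(0)}} f(z)\, dz \right| \le p^{(0)} d^{(0)} \sup_{z \in T^{(n)}_{j(n)}} |\psi(z)| \xrightarrow[n \to \infty]{} 0,$$

which establishes (1.38).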
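The two ingredients of the proof, the vanishing of the integral for holomorphic $f$ and the additivity of the boundary integral under subdivision, can both be observed numerically. In the Python sketch below, the helpers `segment_integral`, `triangle_integral`, and `subdivide`, the triangle with vertices $0$, $2$, $1 + i$, and the test functions are all illustrative assumptions. The non-holomorphic function $\bar{z}$ is included for contrast: its integral around the boundary of a positively oriented triangle equals $2i$ times the area.

```python
import cmath

def segment_integral(f, p, q, n=2000):
    """Midpoint-rule integral of f along the straight segment from p to q."""
    h = 1.0 / n
    return (q - p) * h * sum(f(p + (j + 0.5) * h * (q - p)) for j in range(n))

def triangle_integral(f, tri):
    """Integral of f around the boundary of the triangle with vertices (a, b, c)."""
    a, b, c = tri
    return segment_integral(f, a, b) + segment_integral(f, b, c) + segment_integral(f, c, a)

def subdivide(tri):
    """Split a triangle into 4 subtriangles using the edge midpoints, keeping orientation."""
    a, b, c = tri
    ab, bc, ca = (a + b) / 2, (b + c) / 2, (c + a) / 2
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

T0 = (0j, 2 + 0j, 1 + 1j)          # positively oriented, area 1
conj = lambda z: z.conjugate()     # continuous but not holomorphic

I_exp = triangle_integral(cmath.exp, T0)
I_conj = triangle_integral(conj, T0)
I_conj_children = sum(triangle_integral(conj, t) for t in subdivide(T0))
print(I_exp, I_conj, I_conj_children)
```

The agreement of the last two numbers reflects the cancelation along the internal edges.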


The next few results illustrate how Goursat’s theorem, for all its apparent simplicity, 
can be used to quickly derive even stronger versions of Cauchy’s theorem, gradually 
building up our knowledge toward the general version that will be proved in the next 
section. 


Corollary 1.20 (Goursat’s theorem for rectangles). Theorem 1.18 is also true when we re- 
place the word “triangle” with “rectangle.” 
Proof. Obviously, a rectangle can be decomposed as the union of two triangles, with the
contour integral around the rectangle being the sum of the integrals around the two 
triangles due to cancelation of the integrals going in both directions along the diagonal. 


Corollary 1.21 (existence of a primitive for a holomorphic function on a disc). If f is holo- 
morphic on a disc D, then f = F' for some holomorphic function F on D. 


Proof. The claim is identical to Proposition 1.15, but with a different set of assumptions.
In fact, the proof of that proposition can be easily adapted to prove the existence of a
primitive in the current setting. Specifically, we again define the purported primitive $F$
for $f$ using (1.36), but this time using a particular choice of path $\gamma(z_0, z)$ connecting $z_0$
and $z$, namely, we take $\gamma(z_0, z)$ to be the straight line segment from $z_0$ to $z$.

We now claim that with this definition, for $h$ small in magnitude (so that $z + h$ is
still in the disc $D$), the chain of equalities (1.37) still holds, where in this chain, we also
interpret $\gamma(z, z+h)$ as the straight line segment connecting $z$ and $z + h$. If we can show
this, then the rest of the proof carries through as before. Now, upon inspection of (1.37),
we see that the first and third equalities still hold trivially; it is only the middle equality
that needs to be explained. This equality can be rewritten as

$$\int_{\gamma(z_0, z)} f(w)\, dw + \int_{\gamma(z, z+h)} f(w)\, dw - \int_{\gamma(z_0, z+h)} f(w)\, dw = 0,$$

a relationship between the contour integrals of $f$ along the three straight line segments
$\gamma(z_0, z)$, $\gamma(z_0, z+h)$, and $\gamma(z, z+h)$. This is simply the statement that the contour integral
along the boundary of the triangle with vertices $z_0$, $z$, and $z + h$ is 0, which follows from
Goursat’s theorem.


Theorem 1.22 (Cauchy’s theorem for a disc). If $f$ is holomorphic on a disc, then $\oint_\gamma f(z)\, dz = 0$
for any closed contour $\gamma$ in the disc.


Proof. By Corollary 1.21, f has a primitive, so Corollary 1.14 implies the claimed conse- 
quence. 


1.8 Simply connected regions and the general version of Cauchy’s 
theorem 


We now develop the additional concepts required to formulate and prove the general
version of Cauchy’s theorem. A key notion is that of homotopy of curves. Given a region
$\Omega \subset \mathbb{C}$, two parameterized curves $\gamma_1, \gamma_2 : [0,1] \to \Omega$ (assumed for simplicity of notation
to be defined on $[0,1]$) are said to be homotopic (with fixed endpoints) if $\gamma_1(0) = \gamma_2(0)$,
$\gamma_1(1) = \gamma_2(1)$, and there exists a function $F : [0,1] \times [0,1] \to \Omega$ such that

i) $F$ is continuous.
ii) $F(0, t) = \gamma_1(t)$ for all $t \in [0,1]$.
iii) $F(1, t) = \gamma_2(t)$ for all $t \in [0,1]$.
iv) $F(s, 0) = \gamma_1(0)$ for all $s \in [0,1]$.
v) $F(s, 1) = \gamma_1(1)$ for all $s \in [0,1]$.


The map $F$ is called a homotopy between $\gamma_1$ and $\gamma_2$. Intuitively, for each $s \in [0,1]$, the
function $F_s : t \mapsto F(s, t)$ defines a curve connecting the two endpoints $\gamma_1(0)$ and $\gamma_1(1)$. As
$s$ grows from 0 to 1, this family of curves transitions in a continuous way between the
curve $\gamma_1$ and $\gamma_2$ with the endpoints being fixed in place; see Fig. 1.6.


Figure 1.6: A homotopy between two curves $\gamma_1$ and $\gamma_2$, visualized as a one-parameter family of curves
$t \mapsto F_s(t)$ that interpolate continuously between $\gamma_1 = F_0$ and $\gamma_2 = F_1$, with the endpoints staying fixed.


A common alternative way to define the notion of homotopy of curves is for closed 
curves, where the endpoints are not fixed, but the homotopy must keep the curves closed 
as it is deforming them. The definition of a simply connected region then becomes a 
region in which any two closed curves are homotopic. It is not hard to show that those 
two definitions are equivalent. 

It is easy (but recommended!) to check that the relation of being homotopic is an 
equivalence relation; see Exercise 1.19. 

Next, we define the notion of a simply connected region. A region $\Omega$ is called simply
connected if any two curves $\gamma_1, \gamma_2$ in $\Omega$ with the same endpoints are homotopic. Note that
this is a topological property (in the sense that it is preserved under homeomorphism).
The complex plane, the unit disc, and any region homeomorphic to the unit disc are
simply connected regions (Exercise 1.20).


Theorem 1.23. If $f$ is a holomorphic function on a region $\Omega$, and $\gamma_0, \gamma_1$ are two curves in
$\Omega$ with the same endpoints that are homotopic, then

$$\int_{\gamma_0} f(z)\, dz = \int_{\gamma_1} f(z)\, dz.$$

Proof. As with the proof of Goursat’s theorem in the previous section, this proof is based
on the idea of reducing the global statement about the equality of the two contour
integrals into a local statement. Denote by $F : [0,1] \times [0,1] \to \Omega$ the homotopy between
$\gamma_0$ and $\gamma_1$, and for any $s \in [0,1]$, denote by $\gamma_s : [0,1] \to \mathbb{C}$ the curve $\gamma_s(t) = F(s,t)$. The
strategy of the proof is to show that there are values $0 = s_0 < s_1 < s_2 < \cdots < s_n = 1$ such
that

$$\int_{\gamma_{s_0}} f(z)\,dz = \int_{\gamma_{s_1}} f(z)\,dz = \cdots = \int_{\gamma_{s_{n-1}}} f(z)\,dz = \int_{\gamma_{s_n}} f(z)\,dz.$$

In fact, we can take $s_k = k/n$ for $0 \le k \le n$ with large $n$; we will define $n$ more precisely
below. Fix $1 \le k \le n$. To prove the equality between the two integrals $\int_{\gamma_{s_{k-1}}} f(z)\,dz$ and
$\int_{\gamma_{s_k}} f(z)\,dz$, we decompose each of the two integrals into a sum of integrals over small
pieces of the contours $\gamma_{s_{k-1}}$ and $\gamma_{s_k}$ by writing them as


$$\int_{\gamma_{s_{k-1}}} f(z)\,dz = \sum_{j=1}^{n} \int_{\gamma_{s_{k-1}}|_{[t_{j-1},t_j]}} f(z)\,dz, \tag{1.40}$$

$$\int_{\gamma_{s_k}} f(z)\,dz = \sum_{j=1}^{n} \int_{\gamma_{s_k}|_{[t_{j-1},t_j]}} f(z)\,dz. \tag{1.41}$$

Here $\gamma_s|_{[t_{j-1},t_j]}$ denotes the restriction of the contour $\gamma_s$ to the interval $[t_{j-1},t_j]$, where $t_j$
denotes some sequence of points $0 = t_0 < t_1 < \cdots < t_n = 1$ partitioning $[0,1]$ into
subintervals $[t_{j-1},t_j]$. We will show at the end of the proof that the partition $t_j = j/n$ for
$0 \le j \le n$, where $n$ is large (and is the same $n$ that was used for the definition of
$s_k$ above), works well for our purposes. Specifically, we will show that with the way we
defined $s_k$ and $t_j$ above and with $n$ taken sufficiently large, the following assumption is
satisfied: for all $1 \le k,j \le n$, there exists an open disc $D_{k,j} \subset \Omega$ containing the two curve
segments $\gamma_{s_{k-1}}|_{[t_{j-1},t_j]}$ and $\gamma_{s_k}|_{[t_{j-1},t_j]}$.

Under this assumption, to prove that the two integrals (1.40)-(1.41) are equal, it suffices
to prove that the differences

$$\int_{\gamma_{s_{k-1}}|_{[t_{j-1},t_j]}} f(z)\,dz - \int_{\gamma_{s_k}|_{[t_{j-1},t_j]}} f(z)\,dz \qquad (1 \le j \le n) \tag{1.42}$$

between the integrals over the small subcontours add up to zero.

For each $0 \le j \le n$, let $\eta_{k,j}$ denote a straight line segment (considered as a parameterized
curve) from $\gamma_{s_{k-1}}(t_j)$ to $\gamma_{s_k}(t_j)$, and for each $1 \le j \le n$, let $\Gamma_{k,j}$ denote the closed
curve $\gamma_{s_{k-1}}|_{[t_{j-1},t_j]} + \eta_{k,j} - \gamma_{s_k}|_{[t_{j-1},t_j]} - \eta_{k,j-1}$ (in words: the concatenation of the four
curves $\gamma_{s_{k-1}}|_{[t_{j-1},t_j]}$, $\eta_{k,j}$, “the reverse of $\gamma_{s_k}|_{[t_{j-1},t_j]}$,” and “the reverse of $\eta_{k,j-1}$”). By
the assumption on the disc $D_{k,j}$ (and since a disc is convex, so that it contains the segments
$\eta_{k,j-1}$ and $\eta_{k,j}$), the curve $\Gamma_{k,j}$ is contained in $D_{k,j}$. Therefore by Cauchy’s
theorem for discs (Theorem 1.22) we have

$$\oint_{\Gamma_{k,j}} f(z)\,dz = 0,$$

or, more explicitly,

$$\int_{\gamma_{s_{k-1}}|_{[t_{j-1},t_j]}} f(z)\,dz - \int_{\gamma_{s_k}|_{[t_{j-1},t_j]}} f(z)\,dz = \int_{\eta_{k,j-1}} f(z)\,dz - \int_{\eta_{k,j}} f(z)\,dz.$$

Summing this relation over $j$ and recalling (1.40)-(1.41), we get that

$$\int_{\gamma_{s_{k-1}}} f(z)\,dz - \int_{\gamma_{s_k}} f(z)\,dz = \sum_{j=1}^{n} \left( \int_{\eta_{k,j-1}} f(z)\,dz - \int_{\eta_{k,j}} f(z)\,dz \right) = \int_{\eta_{k,0}} f(z)\,dz - \int_{\eta_{k,n}} f(z)\,dz = 0.$$

(Here, in the next-to-last step the sum is telescoping, and in the last step, we note that
$\eta_{k,0}$ and $\eta_{k,n}$ are both degenerate curves, each of which simply stays at a single point.)
This is precisely the statement about the differences (1.42) that we wanted.

It remains to justify the assumption about the discs $D_{k,j}$. This is done as follows.
First, since the set $A = F([0,1] \times [0,1])$ is compact, it is easy to see (for example, using
the Heine-Borel property) that there exists a number $\varepsilon > 0$ such that the discs $D_\varepsilon(z)$ are
contained in $\Omega$ for all $z \in A$. Second, since $F$ is continuous, and hence also uniformly
continuous, on $[0,1] \times [0,1]$, there exists a number $\delta > 0$ such that for any $0 \le s, s', t, t' \le 1$
with $|s - s'| + |t - t'| < \delta$, we have

$$|\gamma_{s'}(t') - \gamma_s(t)| = |F(s',t') - F(s,t)| < \varepsilon.$$

Let $n$ be an integer larger than $2/\delta$, and let $s_k = k/n$ and $t_j = j/n$ as before. We define the
discs $D_{k,j}$ by $D_{k,j} = D_\varepsilon(\gamma_{s_{k-1}}(t_{j-1}))$ and claim that they satisfy our assumption. Indeed, if
$t \in [t_{j-1},t_j]$, then $|t - t_{j-1}| \le 1/n < \delta/2$, so $|\gamma_{s_{k-1}}(t) - \gamma_{s_{k-1}}(t_{j-1})| < \varepsilon$. This shows that the curve
segment $\gamma_{s_{k-1}}|_{[t_{j-1},t_j]}$ is contained in $D_{k,j}$. Similarly, $|t - t_{j-1}| + |s_k - s_{k-1}| \le 1/n + 1/n < \delta$, so
$|\gamma_{s_k}(t) - \gamma_{s_{k-1}}(t_{j-1})| < \varepsilon$, that is, the curve segment $\gamma_{s_k}|_{[t_{j-1},t_j]}$ is also contained in $D_{k,j}$. This
proves that our assumption about the discs $D_{k,j}$ is satisfied and finishes the proof.
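
As a numerical sanity check of Theorem 1.23, we can approximate the contour integral along several intermediate curves $\gamma_s$ of a homotopy and observe that the values agree. The following is only a sketch under assumptions of our own: the function $f = \exp$, the two curves, the straight-line homotopy, and the helper name `path_integral` are illustrative choices and do not come from the text.

```python
import cmath

def path_integral(f, gamma, n=2000):
    """Approximate the contour integral of f along gamma: [0,1] -> C
    by evaluating f at the midpoint of each of n small chords."""
    total = 0j
    for j in range(n):
        a = gamma(j / n)
        b = gamma((j + 1) / n)
        total += f((a + b) / 2) * (b - a)
    return total

f = cmath.exp                      # an entire function (illustrative choice)

def gamma0(t):                     # straight segment from 0 to 1+i
    return t * (1 + 1j)

def gamma1(t):                     # a parabolic arc with the same endpoints
    return t + 1j * t * t

def F(s, t):                       # straight-line homotopy with fixed endpoints
    return (1 - s) * gamma0(t) + s * gamma1(t)

# Integrals along gamma_s for a few values of s; all should agree.
integrals = [path_integral(f, lambda t, s=s: F(s, t))
             for s in (0.0, 0.25, 0.5, 0.75, 1.0)]

# exp has the primitive exp, so the common value is exp(1+i) - exp(0).
exact = cmath.exp(1 + 1j) - 1
print(integrals, exact)
```

The agreement here is only up to the discretization error of the quadrature rule, so the comparison should be made with a tolerance rather than exactly.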


Theorem 1.24 (Cauchy’s theorem, general version). If $f$ is holomorphic on a simply
connected region $\Omega$, then for any closed curve $\gamma$ in $\Omega$, we have

$$\oint_\gamma f(z)\,dz = 0.$$


Proof. Assume without loss of generality that $\gamma$ is parameterized as a curve on $[0,1]$.
Then it can be thought of as the concatenation of two curves $\gamma_1$ and $-\gamma_2$, where
$\gamma_1 = \gamma|_{[0,1/2]}$ and $\gamma_2$ is the “reverse” of the curve $\gamma|_{[1/2,1]}$. Note that $\gamma_1$ and $\gamma_2$ have the same
endpoints and are therefore homotopic, since $\Omega$ is simply connected. By Theorem 1.23 we have

$$\oint_\gamma f(z)\,dz = \int_{\gamma_1 - \gamma_2} f(z)\,dz = \int_{\gamma_1} f(z)\,dz - \int_{\gamma_2} f(z)\,dz = 0.$$


Combining Theorem 1.24 with Proposition 1.15, we get the following result.

Corollary 1.25. Any holomorphic function on a simply connected region has a primitive.


One subtle issue that is glossed over in many complex analysis textbooks is the ques- 
tion of how to recognize when a region is simply connected. In many practical situations, 
it is easy to recognize or at least accept as intuitively plausible, that the region under dis- 
cussion is homeomorphic to a disc, which of course implies the property of being simply 
connected. This informal style of reasoning will be sufficient for our needs in this book. 
For those readers who prefer a higher level of rigor, we cite without proof the following 
result from topology. 


Theorem 1.26. Given any simple closed curve $\gamma$ in the plane, there is a region $\Omega$ such that:
1. $\Omega$ is bounded;
2. $\Omega$ is the unique connected component of $\mathbb{C} \setminus \gamma$ that is bounded;
3. $\Omega$ is homeomorphic to a disc.


Because of the second property of Q given in the theorem, Q is usually referred to 
as “the region enclosed by y.” 

Theorem 1.26 is a version of the Jordan-Schoenflies theorem, which in turn is 
a strengthened version of the Jordan curve theorem. These results have elementary 
proofs that do not require complex analysis; see [9, 69] and [W6] for additional discus- 
sion and references. A planar curve that is simple and closed is often referred to as a 
Jordan curve. 


Suggested exercises for Section 1.8. 1.19, 1.20, 1.21, 1.22. 


1.9 Consequences of Cauchy’s theorem 


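Theorem 1.27 (Cauchy’s integral formula). If $f$ is holomorphic on a region $\Omega$ containing
the closed disc $D_{\le R}(z_0)$, then

$$\frac{1}{2\pi i} \oint_{C_R(z_0)} \frac{f(w)}{w-z}\,dw =
\begin{cases}
f(z) & \text{if } z \in D_R(z_0), \\
0 & \text{if } z \in \Omega \setminus D_{\le R}(z_0), \\
\text{undefined} & \text{if } z \in C_R(z_0).
\end{cases} \tag{1.43}$$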


Proof. The case where $z \in \Omega \setminus D_{\le R}(z_0)$ is covered by Cauchy’s theorem in a disc, since in
that case the function $w \mapsto f(w)/(w-z)$ is holomorphic in an open set containing $D_{\le R}(z_0)$.

Figure 1.7: The keyhole contour $\Gamma_{\varepsilon,\delta}$.

It remains to deal with the case $z \in D_R(z_0)$. In this case, denote $F_z(w) = f(w)/(w-z)$. The
idea is now to consider instead the integral

$$\oint_{\Gamma_{\varepsilon,\delta}} F_z(w)\,dw = \oint_{\Gamma_{\varepsilon,\delta}} \frac{f(w)}{w-z}\,dw,$$


where $\Gamma_{\varepsilon,\delta}$ is a so-called keyhole contour, namely a contour comprising a large circular
arc around $z_0$ that is a subset of the circle $C_R(z_0)$, and another smaller circular arc of
radius $\varepsilon$ centered at $z$, with two straight line segments connecting the two circular arcs
to form a closed curve, such that the width of the “neck” of the keyhole is $\delta$. (Here $\varepsilon$ and
$\delta$ are two small positive parameters; think of $\varepsilon$ as being small and of $\delta$ as being much
smaller than $\varepsilon$.) See Fig. 1.7. Note that the function $F_z(w)$ is holomorphic inside the region
enclosed by $\Gamma_{\varepsilon,\delta}$. Moreover, this region is clearly homeomorphic to a disc and so is simply
connected. Therefore Cauchy’s theorem gives that

$$\oint_{\Gamma_{\varepsilon,\delta}} F_z(w)\,dw = 0.$$

We now take the limit of this equation as $\delta \to 0$. The two parts of the integral along the
“neck” of the contour $\Gamma_{\varepsilon,\delta}$ cancel out in the limit because $F_z$ is continuous, and hence
uniformly continuous, on the compact set $D_{\le R}(z_0) \setminus D_\varepsilon(z)$. So we can conclude that

$$\oint_{C_R(z_0)} F_z(w)\,dw = \oint_{C_\varepsilon(z)} F_z(w)\,dw. \tag{1.44}$$


The next and final step is to take the limit as $\varepsilon \to 0$ of the right-hand side of this equation.
Write

$$F_z(w) = \frac{f(w) - f(z)}{w-z} + f(z) \cdot \frac{1}{w-z}. \tag{1.45}$$

Integrating each of these two terms separately, for the first term, we have

$$\left| \oint_{C_\varepsilon(z)} \frac{f(w) - f(z)}{w-z}\,dw \right| \le 2\pi\varepsilon \cdot \sup_{|w-z|=\varepsilon} \frac{|f(w) - f(z)|}{\varepsilon} = 2\pi \sup_{|w-z|=\varepsilon} |f(w) - f(z)| \xrightarrow[\varepsilon \to 0]{} 0 \tag{1.46}$$

by the continuity of $f$; and for the second term,

$$\oint_{C_\varepsilon(z)} \frac{f(z)}{w-z}\,dw = f(z) \oint_{C_\varepsilon(z)} \frac{1}{w-z}\,dw = 2\pi i f(z) \tag{1.47}$$

(by a standard calculation; see Exercise 1.21). Combining (1.44)-(1.47) gives that
$\frac{1}{2\pi i} \oint_{C_R(z_0)} F_z(w)\,dw = f(z)$, which was the formula to be proved.
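
Formula (1.43) is easy to test numerically. The sketch below discretizes the circle $C_R(z_0)$ with the trapezoid rule; the test function $f = \sin$, the centre, the radius, and the helper names `circle_integral` and `cauchy` are our own illustrative assumptions, not part of the text.

```python
import cmath
import math

def circle_integral(g, center, radius, n=400):
    """Approximate the integral of g over the positively oriented circle
    |w - center| = radius using the n-point trapezoid rule."""
    total = 0j
    for k in range(n):
        u = cmath.exp(2j * math.pi * k / n)        # point on the unit circle
        w = center + radius * u                    # point on C_R(z0)
        dw = 1j * radius * u * (2 * math.pi / n)   # w'(t) dt
        total += g(w) * dw
    return total

f = cmath.sin                 # an entire function (illustrative choice)
z0, R = 0.5 + 0.5j, 2.0

def cauchy(z):
    """Left-hand side of (1.43) for the point z."""
    return circle_integral(lambda w: f(w) / (w - z), z0, R) / (2j * math.pi)

inside = cauchy(1.0 + 1.0j)    # |z - z0| < R: expect f(z)
outside = cauchy(4.0 - 1.0j)   # |z - z0| > R: expect 0
print(inside, f(1.0 + 1.0j), outside)
```

For points $z$ very close to the circle the integrand becomes nearly singular and many more sample points would be needed, in line with the formula being undefined on $C_R(z_0)$ itself.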


An important particular case of (1.43) is the one in which $z = z_0$. Cauchy’s integral
formula gives in this case that

$$f(z_0) = \frac{1}{2\pi i} \oint_{C_R(z_0)} \frac{f(w)}{w-z_0}\,dw = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + Re^{it})\,dt.$$


In other words, we have proved the following result. 


Theorem 1.28 (Mean value property for holomorphic functions). If $f$ is holomorphic on a
region $\Omega$ containing the closed disc $D_{\le R}(z_0)$, then the value $f(z_0)$ is equal to the average of
the values of $f$ around the circle $C_R(z_0)$.


Considering what the mean value property means for the real and imaginary parts
of $f = u + iv$, which are harmonic functions, we see that they in turn also satisfy a similar
mean value property:

$$u(x,y) = \frac{1}{2\pi} \int_0^{2\pi} u(x + R\cos t, y + R\sin t)\,dt. \tag{1.48}$$


In fact, (1.48) holds for all harmonic functions and is a result known as the mean value 
property for harmonic functions. This result is proved in many textbooks using meth- 
ods from real analysis or partial differential equations. Alternatively, it can be derived 
from the above considerations by proving that every harmonic function in a disc is the 
real part of a holomorphic function. 
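
The mean value property (1.48) can be illustrated numerically. In the sketch below, the harmonic function $u(x,y) = e^x \cos y$ (the real part of $e^z$), the non-harmonic comparison function $v(x,y) = x^2 + y^2$, and the helper name `circle_average` are illustrative assumptions of ours.

```python
import math

def circle_average(u, x, y, R, n=200):
    """Average of u over n equally spaced points on the circle of radius R
    centred at (x, y); this approximates the right-hand side of (1.48)."""
    total = 0.0
    for k in range(n):
        t = 2 * math.pi * k / n
        total += u(x + R * math.cos(t), y + R * math.sin(t))
    return total / n

def u(x, y):                   # harmonic: real part of exp(z)
    return math.exp(x) * math.cos(y)

def v(x, y):                   # not harmonic: its Laplacian equals 4
    return x * x + y * y

avg_u = circle_average(u, 0.3, -0.7, 1.5)
avg_v = circle_average(v, 0.0, 0.0, 1.5)
print(avg_u, u(0.3, -0.7))     # these two values should agree
print(avg_v, v(0.0, 0.0))      # these two values differ (R^2 versus 0)
```

The second comparison shows that the property genuinely depends on harmonicity: for $v$ the circle average equals $R^2$, not the value at the centre.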
Theorem 1.29 (Cauchy’s integral formula, extended version). Under the same assumptions
as in Theorem 1.27, $f$ is differentiable infinitely many times, and for $z \in D_R(z_0)$, its
derivatives $f^{(n)}(z)$ are given by

$$f^{(n)}(z) = \frac{n!}{2\pi i} \oint_{C_R(z_0)} \frac{f(w)}{(w-z)^{n+1}}\,dw. \tag{1.49}$$


Proof. We prove by induction that for all $n \ge 0$, $f^{(n)}(z)$ exists, is differentiable, and is
given by the expression on the right-hand side of (1.49). For $n = 0$, this is the statement
of (1.43) in the case $z \in D_R(z_0)$. For the inductive step, assuming that we have proved the
claim for a given value of $n$, the idea is now to show that the expression on the right-hand
side of (1.49) can be differentiated under the integral sign. More precisely, observe
that, by the inductive hypothesis, if $z + h \in D_R(z_0)$ (which is the case when $h$ is close
enough to 0), then

$$\frac{f^{(n)}(z+h) - f^{(n)}(z)}{h} = \frac{n!}{2\pi i} \oint_{C_R(z_0)} f(w) \cdot \frac{1}{h} \left( (w-z-h)^{-n-1} - (w-z)^{-n-1} \right) dw.$$

It is easily seen that as $h \to 0$, the divided difference $\frac{1}{h}\left( (w-z-h)^{-n-1} - (w-z)^{-n-1} \right)$ converges to
$(n+1)(w-z)^{-n-2}$, uniformly over $w \in C_R(z_0)$. (The same claim without the uniformity is just
the rule for differentiation of a power function; to get the uniformity, we need to “go
back to basics” and repeat the elementary algebraic calculation that was originally used
to derive this power rule; we leave this as an exercise.) It follows that

$$\lim_{h \to 0} \frac{f^{(n)}(z+h) - f^{(n)}(z)}{h} = \frac{n!}{2\pi i} \oint_{C_R(z_0)} (n+1)\, f(w)\, (w-z)^{-n-2}\,dw = \frac{(n+1)!}{2\pi i} \oint_{C_R(z_0)} \frac{f(w)}{(w-z)^{n+2}}\,dw. \tag{1.50}$$

This implies that $f^{(n+1)}(z)$ exists and is equal to the last expression in (1.50), which was
precisely the claim in the $(n+1)$th case. The induction is complete.
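
Formula (1.49) also gives a practical way to compute derivatives from values on a circle. The following sketch checks it for two functions whose derivatives are known in closed form; the function name `derivative_via_cauchy`, the evaluation point, and the choice of circle are illustrative assumptions of ours.

```python
import cmath
import math

def derivative_via_cauchy(f, n, z, z0, R, m=400):
    """Approximate f^(n)(z) by the contour integral (1.49) over the circle
    C_R(z0), discretized with the m-point trapezoid rule."""
    total = 0j
    for k in range(m):
        u = cmath.exp(2j * math.pi * k / m)
        w = z0 + R * u
        dw = 1j * R * u * (2 * math.pi / m)
        total += f(w) / (w - z) ** (n + 1) * dw
    return math.factorial(n) * total / (2j * math.pi)

z = 0.2 + 0.1j                                    # a point inside D_1(0)
third_of_exp = derivative_via_cauchy(cmath.exp, 3, z, 0.0, 1.0)
first_of_sin = derivative_via_cauchy(cmath.sin, 1, z, 0.0, 1.0)
print(third_of_exp, cmath.exp(z))                 # exp''' = exp
print(first_of_sin, cmath.cos(z))                 # sin' = cos
```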


In Theorem 1.29, we have stated one of the most remarkable facts about holomor- 
phic functions but hid it inside a technical-looking claim in a way that makes it seem 
almost like an afterthought. Let us state it more explicitly to pay it proper respect. 


Theorem 1.30 (Infinite differentiability of holomorphic functions). If a function $f$ of a
complex variable is holomorphic in a region $\Omega$, then it is differentiable infinitely many times
there.


The real-analysis analogue of Theorem 1.30 is, of course, (very) false. As another
illustration of how remarkable this result is, recall that in Section 1.4, we proved that
the real and imaginary parts of a holomorphic function are harmonic functions subject
to the extra assumption that those functions are twice continuously differentiable. We
now see that this assumption is not needed and the conclusion that $u, v$ are harmonic
already follows just from the holomorphicity assumption. Moreover, as an added bonus,
we also get “for free” the statement that $u$ and $v$ are themselves infinitely many times
differentiable; that is, they are $C^\infty$ functions. (The fact that harmonic functions are $C^\infty$
can also be proved just using real analysis techniques, but it is nonetheless pleasing to
see it emerging out of the theory we are developing.)


Proof of Morera’s theorem. We already proved that if $f$ is a function all of whose contour
integrals over closed curves vanish, then $f$ has a primitive $F$. Since $F$ is holomorphic,
Theorem 1.29 shows that its derivative $F' = f$ is also holomorphic.


As another immediate corollary to the (extended) Cauchy integral formula, we now 
get an extremely useful family of inequalities that bounds a function f(z) and its deriva- 
tives at some specific point z € C in terms of the values of the function on the boundary 
of a circle centered at z. 


Theorem 1.31 (Cauchy inequalities). For $f$ holomorphic in a region $\Omega$ that contains the
closed disc $D_{\le R}(z)$, we have

$$|f^{(n)}(z)| \le n!\, R^{-n} \sup_{|w-z|=R} |f(w)|. \tag{1.51}$$
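
As a concrete instance of (1.51), take $f = \exp$ and $z = 0$, so that $|f^{(n)}(0)| = 1$ and $\sup_{|w|=R} |e^w| = e^R$. The inequality then reads $1 \le n!\, R^{-n} e^R$. The short sketch below tabulates the right-hand side; the function name `cauchy_bound` and the sampled values of $n$ and $R$ are illustrative assumptions of ours.

```python
import math

def cauchy_bound(n, R):
    """Right-hand side of (1.51) for f = exp at z = 0, using the fact
    that the supremum of |exp(w)| over |w| = R equals exp(R)."""
    return math.factorial(n) * R ** (-n) * math.exp(R)

# The left-hand side of (1.51) is |exp^(n)(0)| = 1 for every n.
bounds = {(n, R): cauchy_bound(n, R)
          for n in range(6)
          for R in (0.5, 1.0, 3.0, 10.0)}

for (n, R), b in sorted(bounds.items()):
    print(n, R, b)
```

The bound is never sharp in this example but always holds, since $e^R = \sum_k R^k/k! \ge R^n/n!$.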


Yet another remarkable fact we can now prove is the equivalence between the class 
of holomorphic functions and the class of functions that are locally expressible as power 
series. One direction in this equivalence—the easy one—was already proved in Theo- 
rem 1.9. The other is given in the following result. 


Theorem 1.32 (Holomorphic functions have convergent power series). If $f$ is holomorphic
in a region $\Omega$ that contains a closed disc $D_{\le R}(z_0)$, then $f$ has a power series expansion at $z_0$

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n,$$

which is convergent for all $z \in D_R(z_0)$. The coefficients $a_n$ in this expansion are given (in
accordance with (1.28)) by $a_n = f^{(n)}(z_0)/n!$.


Proof. The basic idea here is that Cauchy’s integral formula gives us a representation of
$f(z)$ as a weighted “sum” (in fact, an integral, which is a limit of sums) of functions of
the form $z \mapsto (w-z)^{-1}$. Each of the functions in the weighted sum has a power series
expansion since it is, essentially, a geometric series, so the sum also has a power series
expansion.

To make this precise, write

$$\frac{1}{w-z} = \frac{1}{(w-z_0) - (z-z_0)} = \frac{1}{w-z_0} \cdot \frac{1}{1 - \frac{z-z_0}{w-z_0}}
= \frac{1}{w-z_0} \sum_{n=0}^{\infty} \left( \frac{z-z_0}{w-z_0} \right)^n = \sum_{n=0}^{\infty} (w-z_0)^{-n-1} (z-z_0)^n.$$

This is a power series in $z - z_0$, which, for any fixed $w \in C_R(z_0)$, converges absolutely for
all $z$ such that $|z - z_0| < R$ (that is, for all $z \in D_R(z_0)$). Moreover, for each such $z$, the
convergence is clearly uniform in $w \in C_R(z_0)$. Since infinite summations that are absolutely and uniformly
convergent can be interchanged with integration operations, we then get, appealing to
both the regular and extended versions of Cauchy’s integral formula, that

$$f(z) = \frac{1}{2\pi i} \oint_{C_R(z_0)} \frac{f(w)}{w-z}\,dw
= \frac{1}{2\pi i} \oint_{C_R(z_0)} f(w) \sum_{n=0}^{\infty} (w-z_0)^{-n-1} (z-z_0)^n \,dw$$

$$= \sum_{n=0}^{\infty} \left( \frac{1}{2\pi i} \oint_{C_R(z_0)} f(w) (w-z_0)^{-n-1}\,dw \right) (z-z_0)^n
= \sum_{n=0}^{\infty} \frac{f^{(n)}(z_0)}{n!} (z-z_0)^n,$$

which is precisely the expansion we were after.
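
The proof shows that the Taylor coefficients are themselves contour integrals, $a_n = \frac{1}{2\pi i} \oint_{C_R(z_0)} f(w)(w-z_0)^{-n-1}\,dw$, which makes them computable from values of $f$ on a circle. The sketch below does this for $f(z) = 1/(1-z)$, whose coefficients at $z_0 = 0$ are all equal to 1; the helper name `taylor_coefficient`, the radius, and the number of terms are illustrative assumptions of ours.

```python
import cmath
import math

def taylor_coefficient(f, n, z0, R, m=200):
    """Approximate a_n = (1/(2 pi i)) * integral over C_R(z0) of
    f(w) (w - z0)^(-n-1) dw with the m-point trapezoid rule."""
    total = 0j
    for k in range(m):
        u = cmath.exp(2j * math.pi * k / m)
        w = z0 + R * u
        dw = 1j * R * u * (2 * math.pi / m)
        total += f(w) * (w - z0) ** (-n - 1) * dw
    return total / (2j * math.pi)

def f(z):                      # holomorphic on the unit disc; a_n = 1 for all n
    return 1 / (1 - z)

coeffs = [taylor_coefficient(f, n, 0.0, 0.5) for n in range(30)]

# Evaluate the truncated power series at a point inside the disc D_{0.5}(0).
z = 0.3
partial_sum = sum(a * z ** n for n, a in enumerate(coeffs))
print(coeffs[:4], partial_sum, f(z))
```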


Theorem 1.33 (Liouville’s theorem). A bounded entire function is constant. 


Proof. Let $f$ be bounded and entire, and let $M = \sup_{z \in \mathbb{C}} |f(z)| < \infty$. By the case $n = 1$ of
the Cauchy inequalities (1.51), for any $z \in \mathbb{C}$ and $R > 0$, we have

$$|f'(z)| \le R^{-1} \sup_{|w-z|=R} |f(w)| \le \frac{M}{R}.$$

Taking the limit as $R \to \infty$ gives that $f'(z) = 0$. Since $f'$ is identically 0, $f$ is constant by
Lemma 1.16.


Exercises 1.23, 1.24, and 1.25 explore some additional ideas related to Liouville’s the- 
orem and additional results that can be proved using a similar technique. 


Proposition 1.34. If $f$ is holomorphic on a region $\Omega$, and $f(z) = 0$ for $z$ in a set that has
a limit point in $\Omega$, then $f$ is identically zero on $\Omega$.


The condition that the limit point $z_0$ is in $\Omega$ in this result is needed. For example, the
function $e^{1/z} - 1$, which is holomorphic on $\mathbb{C} \setminus \{0\}$, has zeros in every neighborhood of
$z_0 = 0$ but is not identically zero.
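
The zeros in this example can be written down explicitly: $e^{1/z} = 1$ exactly when $1/z = 2\pi i k$ for a nonzero integer $k$, that is, at the points $z_k = 1/(2\pi i k)$, which accumulate at 0. The following short check is a sketch; the variable names and the range of $k$ are our own choices.

```python
import cmath
import math

def f(z):
    return cmath.exp(1 / z) - 1

# The points z_k = 1 / (2 pi i k), k = 1, 2, ..., 7.
zeros = [1 / (2j * math.pi * k) for k in range(1, 8)]
values = [f(z) for z in zeros]

for z, val in zip(zeros, values):
    print(abs(z), abs(val))    # |z_k| shrinks towards 0 while f(z_k) is ~0

not_identically_zero = f(0.5)  # equals e^2 - 1, which is nonzero
```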
Proof of Proposition 1.34. Let $z_0 \in \Omega$ be a limit point of zeros of $f$. This means that there
is a sequence $(w_k)_{k=1}^{\infty}$ of points in $\Omega$ such that $f(w_k) = 0$ for all $k$, $w_k \to z_0$ as $k \to \infty$, and
$w_k \ne z_0$ for all $k$. We know that in a neighborhood of $z_0$, $f$ has a convergent power series
expansion. If we assume that $f$ is not identically zero in a neighborhood of $z_0$, then we
can write the power series expansion as

$$f(z) = \sum_{n=0}^{\infty} a_n (z-z_0)^n = \sum_{n=m}^{\infty} a_n (z-z_0)^n
= a_m (z-z_0)^m \sum_{n=0}^{\infty} \frac{a_{n+m}}{a_m} (z-z_0)^n = a_m (z-z_0)^m (1 + g(z)),$$

where we define $m$ to be the smallest index such that $a_m \ne 0$, and define
$g(z) = \sum_{n=1}^{\infty} \frac{a_{n+m}}{a_m} (z-z_0)^n$. Note that $g$ is a holomorphic function in a neighborhood of $z_0$ that
satisfies $g(z_0) = 0$. It follows that for all $k$,

$$a_m (w_k - z_0)^m (1 + g(w_k)) = f(w_k) = 0,$$

but for large enough $k$, this is impossible, since $w_k - z_0 \ne 0$ for all $k$ and $g(w_k) \to g(z_0) = 0$
as $k \to \infty$.

The conclusion is that $f$ is identically zero at least in a neighborhood of $z_0$. Now we
claim that this also implies that $f$ is identically zero on all of $\Omega$, because $\Omega$ is a region
(open and connected). More precisely, denote by $U$ the set of points $z \in \Omega$ such that $f$
is equal to 0 in a neighborhood of $z$. It is obvious that $U$ is open; $U$ is also closed (relative
to $\Omega$) by the argument above, which shows that any point of $\Omega$ that is a limit of points in $U$
must be in $U$; and $U$ is nonempty (it contains $z_0$, again by what we showed above). It follows that
$U = \Omega$ by the well-known characterization of a connected set in the plane as a set $E$ that
has no “clopen” (closed and open) subsets other than the empty set and $E$ itself.


Proposition 1.34 has an equivalent form that is more memorable, given in the next 
result. 


Theorem 1.35 (Zeros of holomorphic functions are isolated). If $f$ is holomorphic on $\Omega$, is
not identically zero on $\Omega$, and $f(z_0) = 0$ for $z_0 \in \Omega$, then for some $\varepsilon > 0$, the punctured
neighborhood $D_\varepsilon(z_0) \setminus \{z_0\}$ of $z_0$ contains no zeros of $f$. In other words, the set of zeros of
$f$ contains only isolated points.


Corollary 1.36. If $f, g$ are holomorphic on a region $\Omega$, and $f(z) = g(z)$ for $z$ in a set with a
limit point in $\Omega$ (e.g., an open disc or even a sequence of points $z_n$ converging to some
$z \in \Omega$), then $f = g$ everywhere in $\Omega$.


Proof. Apply the previous result to f - g. 


The previous result is usually reformulated slightly as the following conceptually
important result.


Theorem 1.37 (Principle of analytic continuation). If $f$ is holomorphic on a region $\Omega$, and
$f_1$ is holomorphic on a bigger region $\Omega_1 \supset \Omega$ and satisfies $f_1(z) = f(z)$ for all $z \in \Omega$, then
$f_1$ is the unique such extension, in the sense that if $\tilde{f}_1$ is another function with the same
properties, then $f_1(z) = \tilde{f}_1(z)$ for all $z \in \Omega_1$.


The function $f_1$ in Theorem 1.37, if it exists, is usually referred to as the analytic
continuation of $f$.

The principle of analytic continuation is of fundamental importance in complex 
analysis. One of the common ways in which it is used is as a tool for justifying the con- 
struction of interesting holomorphic functions in several stages, where one starts by 
defining the function on a small region and then shows how to extend the definition to a
larger region (see Chapter 2 for two of the most famous examples of this idea). There are 
often several ways of performing the extension, with no single one of them being neces- 
sarily more natural or canonical than the others, so we typically appeal to the principle 
of analytic continuation to explain why we end up with the same extended function 
regardless of which particular construction is used. In that sense, the principle of ana- 
lytic continuation gives a philosophical justification for regarding naturally occurring 
holomorphic functions, such as the Euler gamma function and Riemann zeta function 
discussed in Chapter 2, as having a kind of idealized Platonic existence that transcends 
any particular formula used to represent them. 

This philosophical point of view can be illustrated in an amusing way in a more 
elementary setting. In real analysis, we learn that “formulas” such as 


$$1 - 1 + 1 - 1 + 1 - 1 + \cdots = \frac{1}{2}, \tag{1.52}$$

$$1 + 2 + 4 + 8 + 16 + 32 + \cdots = -1 \tag{1.53}$$


do not have any meaning, despite the fact that they can be easily “proved” using alge- 
braic manipulations of a somewhat dubious nature. However, in the context of complex 
analysis, we can in fact make perfect sense of such identities using the principle of an- 
alytic continuation! Do you see how? (Exercise 1.27.) Additional seemingly meaningless 
formulas of this type, beloved by complex analysts and recreational mathematicians 
alike, are 

$$1 + 2 + 3 + 4 + \cdots = -\frac{1}{12}, \tag{1.54}$$

$$1 - 2 + 3 - 4 + \cdots = \frac{1}{4}. \tag{1.55}$$

These formulas have attracted considerable attention in recent years, being the subject
of a popular online video [W7], newspaper articles [W8], discussions on mathematics
blogs and forums [W9], [W10], [W11], a Wikipedia article [W12], and more. We will learn
in Chapter 2 that they, too, can be given a formal meaning that is no less precise or
rigorous than the formulas involving convergent series that you are more familiar with
from real analysis; see Exercise 2.11.

We now discuss a particular case of analytic continuation that constitutes the most 
minimalistic kind of continuation we can imagine, namely, a scenario in which a holo- 
morphic function is extended to a region that is larger by a single point relative to the 
original domain on which it is defined. This is usually described in terms of the so-called 
removable singularity. A point $z_0 \in \Omega$ is called a removable singularity of a function
$f : \Omega \to \mathbb{C} \cup \{\text{undefined}\}$ if $f$ is holomorphic in a punctured neighborhood of $z_0$, is not
holomorphic at $z_0$, but its value at $z_0$ can be redefined so as to make it holomorphic at
$z_0$, that is, if we can perform an analytic continuation of $f$ from $\Omega \setminus \{z_0\}$ to $\Omega$. Of course,
in this case the fact that the analytic continuation is unique is trivial; the issue here is to
understand when the continuation exists, and the next result gives a useful condition.


Theorem 1.38 (Riemann’s removable singularities theorem). Assume that $f$ is holomorphic
in $\Omega$ except at a point $z_0 \in \Omega$ (where it may be undefined or be defined but not known to be
holomorphic or even continuous). Assume further that $f$ is bounded in a punctured neighborhood
$D_r(z_0) \setminus \{z_0\}$ of $z_0$. Then $z_0$ is a removable singularity of $f$.


Proof. Fix some disc $D = D_R(z_0)$ around $z_0$ whose closure is contained in $\Omega$. Define the
function

$$\tilde{f}(z) = \frac{1}{2\pi i} \oint_{C_R(z_0)} \frac{f(w)}{w-z}\,dw \qquad (z \in D). \tag{1.56}$$

We claim that $\tilde{f}$ extends $f$ to a holomorphic function on $D$, which requires showing that
$\tilde{f}(z) = f(z)$ for all $z \in D \setminus \{z_0\}$ and that $\tilde{f}$ is holomorphic at $z_0$. For the first part of the claim,
let $z \in D \setminus \{z_0\}$. Consider a “double keyhole” contour $K_{\varepsilon,\delta}$ that surrounds most of the disc $D$
but makes diversions to avoid the points $z_0$ and $z$, circling them in the negative direction
around most of a circle of radius $\varepsilon$ (Fig. 1.8). We assume that $0 < \delta < \varepsilon < \frac{1}{2}|z - z_0|$. Now
the region enclosed by $K_{\varepsilon,\delta}$ is simply connected, so, after applying Cauchy’s theorem and
a limiting argument similar to that used in the proof of Theorem 1.27 (taking the limit as
$\delta \to 0$ with $\varepsilon$ fixed), we get that

$$\tilde{f}(z) = \frac{1}{2\pi i} \oint_{C_\varepsilon(z)} \frac{f(w)}{w-z}\,dw + \frac{1}{2\pi i} \oint_{C_\varepsilon(z_0)} \frac{f(w)}{w-z}\,dw. \tag{1.57}$$


On the right-hand side, the first term is equal to $f(z)$ by a straightforward application
of Cauchy’s integral formula. The second term can be bounded in magnitude using the
assumption that $f$ is bounded in a neighborhood of $z_0$; more precisely, denote
$M = \sup_{w \in D_r(z_0) \setminus \{z_0\}} |f(w)| < \infty$. For $\varepsilon < r$, we have

$$\left| \oint_{C_\varepsilon(z_0)} \frac{f(w)}{w-z}\,dw \right| \le 2\pi\varepsilon \cdot \sup_{w \in C_\varepsilon(z_0)} |f(w)| \cdot \frac{1}{|z-z_0| - \varepsilon} \le \frac{4\pi M \varepsilon}{|z-z_0|}.$$

Thus the claim that $\tilde{f}(z) = f(z)$ follows by taking the limit of (1.57) as $\varepsilon \to 0$.

Figure 1.8: The double keyhole contour $K_{\varepsilon,\delta}$.

It remains to prove that $\tilde{f}$ defined in (1.56) is holomorphic at $z_0$. This is easy to see and
is something we already knew implicitly. For example, the relevant argument (involving
a direct manipulation of the divided differences $\frac{1}{h}(\tilde{f}(z+h) - \tilde{f}(z))$) appeared in the proof
of Theorem 1.29. Another approach is to show that integrating $\tilde{f}$ over closed contours
gives 0 (which requires interchanging the order of two integration operations, which
will not be hard to justify) and then use Morera’s theorem. The details are left as an
exercise.
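
A standard example of a removable singularity is $f(z) = \sin(z)/z$ at $z_0 = 0$. The sketch below evaluates the integral in (1.56) numerically, using only values of $f$ on a circle around the singularity, and recovers both the value that removes the singularity and the original values of $f$ nearby. The example function, the radius, and the name `f_tilde` are illustrative assumptions of ours.

```python
import cmath
import math

def f(w):
    """sin(w)/w, which is not defined at w = 0."""
    return cmath.sin(w) / w

def f_tilde(z, R=1.0, m=400):
    """The integral in (1.56) over the circle C_R(0), approximated with the
    m-point trapezoid rule; only values of f on the circle are used."""
    total = 0j
    for k in range(m):
        u = cmath.exp(2j * math.pi * k / m)
        w = R * u
        dw = 1j * R * u * (2 * math.pi / m)
        total += f(w) / (w - z) * dw
    return total / (2j * math.pi)

value_at_singularity = f_tilde(0.0)   # the value that makes f holomorphic at 0
value_nearby = f_tilde(0.3)           # should coincide with f(0.3)
print(value_at_singularity, value_nearby, f(0.3))
```

The computed value at the singularity is the familiar limit $\lim_{z \to 0} \sin(z)/z = 1$.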


We now introduce the concept of uniform convergence on compact subsets. If $f$
and $(f_n)_{n=1}^{\infty}$ are holomorphic functions on a region $\Omega$, we say that the sequence $f_n$
converges to $f$ uniformly on compact subsets if for any compact set $K \subset \Omega$, $f_n(z) \to f(z)$
uniformly on $K$. This mode of convergence is preserved under differentiation, as the
following result makes precise.


Theorem 1.39. If $f_n \to f$ uniformly on compact subsets in $\Omega$ and $f_n$ are holomorphic, then
$f$ is holomorphic, and $f_n' \to f'$ uniformly on compact subsets in $\Omega$.


Proof. The fact that $f$ is holomorphic can be shown through a combination of Cauchy’s
and Morera’s theorems. More precisely, note that for each closed disc $D_{\le r}(z_0) \subset \Omega$, we
have $f_n(z) \to f(z)$ uniformly on $D_{\le r}(z_0)$. In particular, for each closed curve $\gamma$ whose image is
contained in the open disc $D_r(z_0)$,

$$\oint_\gamma f_n(z)\,dz \xrightarrow[n \to \infty]{} \oint_\gamma f(z)\,dz.$$

By Cauchy’s theorem the integrals in this sequence are all zero, so $\oint_\gamma f(z)\,dz = 0$. Since
this is true for all closed curves $\gamma$ in the disc $D_r(z_0)$, by Morera’s theorem, $f$ is holomorphic
on $D_r(z_0)$. This holds for any disc whose closure is in $\Omega$, and holomorphicity is a local
property, so we have shown that $f$ is holomorphic on all of $\Omega$, as claimed.

Next, to show that $f_n' \to f'$ uniformly on compact sets, we start by proving that uniform
convergence holds on a certain family of discs. Let $D_r(z_0)$ be a disc whose closure
is contained in $\Omega$. For $z \in D_r(z_0)$, we have by Cauchy’s integral formula (the case $n = 1$ of (1.49)) that

$$f_n'(z) - f'(z) = \frac{1}{2\pi i} \oint_{C_r(z_0)} \frac{f_n(w)}{(w-z)^2}\,dw - \frac{1}{2\pi i} \oint_{C_r(z_0)} \frac{f(w)}{(w-z)^2}\,dw
= \frac{1}{2\pi i} \oint_{C_r(z_0)} \frac{f_n(w) - f(w)}{(w-z)^2}\,dw.$$

This implies that $f_n'(z) \to f'(z)$ as $n \to \infty$, uniformly as $z$ ranges on the disc $D_{r/2}(z_0)$, since
$f_n(w) \to f(w)$ uniformly for $w \in C_r(z_0) \subset D_{\le r}(z_0)$, and since the bound $|w-z|^{-2} \le (r/2)^{-2}$
holds for $z \in D_{r/2}(z_0)$ and $w \in C_r(z_0)$.

Now let $K \subset \Omega$ be compact. For each $z \in K$, let $r(z)$ be the radius of a closed disc
$D_{\le r(z)}(z)$ around $z$ that is contained in $\Omega$. The family of discs $\{B_z := D_{r(z)/2}(z) : z \in K\}$
is an open covering of $K$, so by the Heine-Borel property of compact sets it has a finite
subcovering $B_{z_1}, \ldots, B_{z_N}$. We showed that $f_n'(z) \to f'(z)$ uniformly on every $B_{z_i}$, so we
also have uniform convergence on their union, which contains $K$, so we get that $f_n' \to f'$
uniformly on $K$, as claimed.


Suggested exercises for Section 1.9. 1.23, 1.24, 1.25, 1.26, 1.27. 


1.10 Zeros, poles, and the residue theorem 


We say that a complex number $z_0$ is a zero of a holomorphic function $f$ if $f(z_0) = 0$. Zeros
in complex analysis behave rather like zeros of polynomials, in the sense that a zero must
have an integer multiplicity, known as its order. More precisely, we say that $z_0$ is a zero
of order $m \ge 1$ of a nonconstant holomorphic function $f$ if it can be represented in the
form

$$f(z) = (z - z_0)^m g(z) \tag{1.58}$$

in some neighborhood of $z_0$, where $m \ge 1$, and $g$ is a holomorphic function in that
neighborhood such that $g(z_0) \ne 0$. A zero of order 1 is called a simple zero.


Lemma 1.40. The order of a zero is a well-defined concept. That is, if $f$ is a nonconstant
holomorphic function and $f(z_0) = 0$, then representation (1.58) with the properties of $g$ as
given above exists for a unique integer $m \ge 1$.
Proof. We make use of power series expansions and a calculation similar to that used
in the proof of Proposition 1.34. Write the power series expansion (known to converge
in a neighborhood of $z_0$)

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n = \sum_{n=m}^{\infty} a_n (z - z_0)^n,$$

where $m$ is the smallest index $\ge 0$ such that $a_m \ne 0$. Since $a_0 = f(z_0) = 0$, it must be the
case that $m \ge 1$. If we now define

$$g(z) = \sum_{k=0}^{\infty} a_{m+k} (z - z_0)^k,$$

then clearly $f(z) = (z - z_0)^m g(z)$, and $g(z_0) = a_m \ne 0$; this proves the existence of
representation (1.58). On the other hand, given a representation of this form, expanding
$g(z)$ as a power series around $z_0$ shows that $m$ has to be the smallest index of a nonzero
coefficient in the power series expansion of $f(z)$ around $z_0$. This proves the uniqueness
claim.
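
The proof identifies the order of a zero with the index of the first nonvanishing Taylor coefficient, which suggests a simple numerical procedure. The sketch below applies it to $f(z) = \sin z - z$, which has a zero of order 3 at $z_0 = 0$ with $g(z_0) = a_3 = -1/6$. The helper name `taylor_coefficient`, the threshold `1e-9` used to decide that a coefficient is nonzero, and the example function are illustrative assumptions of ours.

```python
import cmath
import math

def taylor_coefficient(f, n, z0, R, m=200):
    """Approximate the n-th Taylor coefficient of f at z0 via the contour
    integral of f(w) (w - z0)^(-n-1) over C_R(z0), divided by 2 pi i."""
    total = 0j
    for k in range(m):
        u = cmath.exp(2j * math.pi * k / m)
        w = z0 + R * u
        dw = 1j * R * u * (2 * math.pi / m)
        total += f(w) * (w - z0) ** (-n - 1) * dw
    return total / (2j * math.pi)

def f(z):
    return cmath.sin(z) - z

coeffs = [taylor_coefficient(f, n, 0.0, 1.0) for n in range(8)]

# The order is the smallest index whose coefficient is numerically nonzero.
order = next(n for n, a in enumerate(coeffs) if abs(a) > 1e-9)
g_at_z0 = coeffs[order]        # g(z0) = a_m in representation (1.58)
print(order, g_at_z0)
```

In floating-point arithmetic the test for a vanishing coefficient necessarily involves a tolerance, so this procedure is a heuristic rather than a proof of the order.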


In the definition above, in the case where $z_0$ is not a zero of $f$, the same
representation (1.58) holds with $m = 0$ (and $g = f$), so in certain contexts, we may occasionally
describe this situation by saying that $z_0$ is a zero of order 0.

If $f$ is holomorphic in a punctured neighborhood of a point $z_0$, then we say that it
has a pole of order $m$ at $z_0$ if the function $h(z) = 1/f(z)$ (defined to be 0 at $z_0$) has a zero
of order $m$ at $z_0$. A pole of order 1 is called a simple pole. As with the case of zeros, we
can extend this definition in an obvious way by saying that $f$ has a pole of order 0 if $f$
is holomorphic at $z_0$ or has a removable singularity there, and the value $f(z_0)$ (or the
redefined value $\lim_{z \to z_0} f(z)$ that makes $f$ holomorphic at $z_0$ in the case of a removable
singularity) is nonzero.


Lemma 1.41. A function $f$ has a pole of order $m$ at $z_0$ if and only if it can be represented
in the form

$$f(z) = (z - z_0)^{-m} g(z)$$

in a punctured neighborhood of $z_0$, where $g$ is holomorphic in a neighborhood of $z_0$ and
satisfies $g(z_0) \ne 0$.


Proof. Apply the previous lemma to $1/f(z)$.


Theorem 1.42. If $f$ has a pole of order $m$ at $z_0$, then it can be represented in a unique way
as

$$f(z) = \frac{a_{-m}}{(z-z_0)^m} + \frac{a_{-m+1}}{(z-z_0)^{m-1}} + \cdots + \frac{a_{-1}}{z-z_0} + G(z), \tag{1.59}$$

where $G$ is holomorphic in a neighborhood of $z_0$, and $a_{-m}, \ldots, a_{-1}$ are complex
numbers with $a_{-m} \ne 0$.


Proof. The function $g(z) = (z - z_0)^m f(z)$ is holomorphic in a neighborhood of $z_0$ and
satisfies $g(z_0) \ne 0$. Write its power series expansion as

$$g(z) = \sum_{n=0}^{\infty} b_n (z - z_0)^n \tag{1.60}$$

$$= b_0 + b_1 (z - z_0) + \cdots + b_{m-1} (z - z_0)^{m-1} + \sum_{n=m}^{\infty} b_n (z - z_0)^n. \tag{1.61}$$

Here $b_0 = g(z_0) \ne 0$. Now defining $G(z) = \sum_{n=m}^{\infty} b_n (z - z_0)^{n-m}$ and converting (1.61) to
an expression for $f$, we get that

$$f(z) = \frac{b_0}{(z-z_0)^m} + \frac{b_1}{(z-z_0)^{m-1}} + \cdots + \frac{b_{m-1}}{z-z_0} + G(z),$$

which is of the correct form (1.59) if we further define $a_{-j} = b_{m-j}$ for $1 \le j \le m$. This
proves the existence part of the claim; the uniqueness part is left as an easy exercise.


In representation (1.59) the expression

$$\frac{a_{-m}}{(z-z_0)^m} + \frac{a_{-m+1}}{(z-z_0)^{m-1}} + \cdots + \frac{a_{-1}}{z-z_0}$$

is called the principal part of $f$ at the pole $z_0$. The coefficient $a_{-1}$ is called the residue
of $f$ at $z_0$ and denoted $\operatorname{Res}_{z_0}(f)$.
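
The proof of Theorem 1.42 is constructive: the coefficients of the principal part are the first $m$ Taylor coefficients of $g(z) = (z - z_0)^m f(z)$, read in reverse order. The sketch below follows this recipe for $f(z) = e^z / z^3$, which has a pole of order 3 at $z_0 = 0$ with $a_{-3} = 1$, $a_{-2} = 1$, and residue $a_{-1} = 1/2$. The helper name `taylor_coefficient`, the example function, and the assumption that the pole order is known in advance are ours.

```python
import cmath
import math

def taylor_coefficient(g, n, z0, R, m=200):
    """Approximate the n-th Taylor coefficient of g at z0 via the contour
    integral of g(w) (w - z0)^(-n-1) over C_R(z0), divided by 2 pi i."""
    total = 0j
    for k in range(m):
        u = cmath.exp(2j * math.pi * k / m)
        w = z0 + R * u
        dw = 1j * R * u * (2 * math.pi / m)
        total += g(w) * (w - z0) ** (-n - 1) * dw
    return total / (2j * math.pi)

pole_order = 3                 # assumed known for this example

def f(z):
    return cmath.exp(z) / z ** 3

def g(z):                      # g(z) = (z - z0)^m f(z) with z0 = 0
    return z ** pole_order * f(z)

# b_0, ..., b_{m-1} as in (1.60)-(1.61), sampled on the unit circle only.
b = [taylor_coefficient(g, n, 0.0, 1.0) for n in range(pole_order)]

# a_{-j} = b_{m-j} for 1 <= j <= m, as in the proof of Theorem 1.42.
principal = {-j: b[pole_order - j] for j in range(1, pole_order + 1)}
residue = principal[-1]
print(principal, residue)
```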

The definitions of the order of a zero and a pole can be unified into a single
consistent definition of the (generalized) order of a zero, where if $f$ has a pole of order
$m$ at $z_0$, then we say instead that $f$ has a zero of order $-m$. Denote the order of a zero
of $f$ at $z_0$ (an integer, which may be positive, negative, or zero) by $\operatorname{ord}_{z_0}(f)$. With these
definitions, it is easy to check (Exercise 1.28) that

$$\operatorname{ord}_{z_0}(f + g) \ge \min(\operatorname{ord}_{z_0}(f), \operatorname{ord}_{z_0}(g)), \tag{1.62}$$

$$\operatorname{ord}_{z_0}(fg) = \operatorname{ord}_{z_0}(f) + \operatorname{ord}_{z_0}(g). \tag{1.63}$$


The residue theorem is a famous formula for evaluating integrals around closed 
contours of functions holomorphic inside the region enclosed by the contour, except 
for a discrete set of points. This theorem, like Cauchy’s theorem, has several different 
formulations addressing different levels of generality. We give below three versions of
the theorem, which are sufficient for our needs.


Theorem 1.43 (The residue theorem; simple version). Assume that $f$ is holomorphic in a
region containing a closed disc $D_{\le R}(z_0)$, except for a pole at $z_1 \in D_R(z_0)$. Then

$$\oint_{C_R(z_0)} f(z)\,dz = 2\pi i \operatorname{Res}_{z_1}(f).$$


Proof. By the standard argument involving a keyhole contour we see that the circle
$C_R(z_0)$ in the integral can be replaced with a circle $C_\varepsilon(z_1)$ of a small radius $\varepsilon > 0$ around
$z_1$, that is, we have

$$\oint_{C_R(z_0)} f(z)\,dz = \oint_{C_\varepsilon(z_1)} f(z)\,dz.$$

When $\varepsilon$ is small enough, to evaluate the integral over $C_\varepsilon(z_1)$, we can use decomposition
(1.59) of $f$ (written at the pole $z_1$) into its principal part and the remaining holomorphic part. Integrating
the right-hand side of (1.59) termwise over the contour $C_\varepsilon(z_1)$ gives 0 for the integral of
$G(z)$ by Cauchy’s theorem; 0 for the integrals of the powers $(z - z_1)^k$ with $-m \le k \le -2$ by a
standard computation (Exercise 1.21); and $2\pi i\, a_{-1} = 2\pi i \operatorname{Res}_{z_1}(f)$ for the integral of $a_{-1}(z - z_1)^{-1}$
by the same standard computation. This gives the result.


Theorem 1.44 (The residue theorem for discs). Assume that $f$ is holomorphic in a region
containing a closed disc $D_{\le R}(z_0)$, except for a finite number of poles at $z_1, \ldots, z_N \in D_R(z_0)$.
Then

$$\oint_{C_R(z_0)} f(z)\,dz = 2\pi i \sum_{k=1}^{N} \operatorname{Res}_{z_k}(f).$$


Proof. The idea is the same as in the proof of Theorem 1.43, except that now we use a
contour with multiple keyholes (one for each $z_k$) to deduce after a limiting argument that

$$\oint_{C_R(z_0)} f(z)\,dz = \sum_{k=1}^{N} \oint_{C_\epsilon(z_k)} f(z)\,dz$$

for a small enough $\epsilon$, and then proceed as before.


Theorem 1.45 (The residue theorem for simple closed contours). Let $f$ be a function
defined in a region $\Omega$ containing a simple closed curve $\gamma$ (oriented in the positive direction).
Denote by $R_\gamma$ the region enclosed by $\gamma$. Assume that $f$ is holomorphic everywhere in $\Omega$
except for the finite set of points $z_1, \ldots, z_N \in R_\gamma$, where it has poles. Then

$$\oint_{\gamma} f(z)\,dz = 2\pi i \sum_{k=1}^{N} \operatorname{Res}_{z_k}(f).$$

Sketch of proof. Again, construct a multiple keyhole version of the original contour $\gamma$
and then use a limiting argument to conclude that

$$\oint_{\gamma} f(z)\,dz = \sum_{k=1}^{N} \oint_{C_\epsilon(z_k)} f(z)\,dz$$

for a small enough $\epsilon$. Then proceed as before.
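
The residue theorem for discs can be checked numerically on a concrete example. The following is a minimal sketch, not part of the original text: the test function `f`, the helper `circle_integral`, and the number of sample points are all illustrative assumptions. It approximates the contour integral over the unit circle by a Riemann sum and compares it with $2\pi i$ times the sum of the residues at the enclosed poles, which are computed by hand.

```python
import cmath
import math


def circle_integral(f, center, radius, n=2000):
    """Approximate the integral of f over the positively oriented circle
    |z - center| = radius by a Riemann sum with n equally spaced nodes."""
    total = 0j
    step = 2 * math.pi / n
    for j in range(n):
        direction = cmath.exp(1j * step * j)
        z = center + radius * direction
        dz = 1j * radius * direction * step  # gamma'(theta) * d(theta)
        total += f(z) * dz
    return total


def f(z):
    # Hypothetical test function: simple poles at 0 and 1/2 (inside the unit
    # circle) and at 2 (outside it).
    return 1 / (z * (z - 0.5) * (z - 2))


# Residues at the two enclosed simple poles, computed by hand:
# Res_0(f) = 1 / ((0 - 0.5)(0 - 2)) = 1 and
# Res_{1/2}(f) = 1 / (0.5 * (0.5 - 2)) = -4/3.
residue_sum = 1 + (-4 / 3)

numeric = circle_integral(f, center=0, radius=1)
predicted = 2j * math.pi * residue_sum
print(abs(numeric - predicted))  # should be very close to 0
```

The pole at 2 lies outside the contour and contributes nothing, in agreement with the statement of Theorem 1.44.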


Suggested exercises for Section 1.10. 1.28. 


1.11 Meromorphic functions, holomorphicity at ∞, and the
Riemann sphere


We extend the notion of holomorphicity in two directions by introducing the notions of
meromorphicity and holomorphicity at $\infty$. First, a function $f : \Omega \to \mathbb{C} \cup \{\text{undefined}\}$
on a region $\Omega$ is called meromorphic if $f$ is holomorphic except for a discrete set of
points, all of which are poles of $f$.

Second, let $U \subset \mathbb{C}$ be an open set containing the complement $\mathbb{C} \setminus D_{\le R}(0)$ of a closed
disc around 0. A function $f : U \to \mathbb{C}$ is holomorphic at $\infty$ if $g(z) = f(1/z)$ (defined on
the punctured neighborhood $D_{1/R}(0) \setminus \{0\}$ of 0) has a removable singularity at 0. In that case, we define
$f(\infty) = g(0)$ (more precisely, the value that makes $g$ holomorphic at 0).

Conceptually, the above definitions can be thought of as extending the notion of
what a complex number is to include an additional “point at infinity.” Formally, we
define the set of extended complex numbers, also known as the Riemann sphere, as the
set $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ equipped with several layers of additional structure:

- Topological structure. We think of $\hat{\mathbb{C}}$ as the one-point compactification of $\mathbb{C}$; that
is, we add to $\mathbb{C}$ an additional element $\infty$ and say that the open neighborhoods of $\infty$
are the complements of compact sets in $\mathbb{C}$. This turns $\hat{\mathbb{C}}$ into a topological space in a
simple way.

- Geometric structure. We can identify $\hat{\mathbb{C}}$ with an actual sphere embedded in $\mathbb{R}^3$,
namely

$$S^2 = \left\{ (x,y,z) \in \mathbb{R}^3 : x^2 + y^2 + \left(z - \tfrac{1}{2}\right)^2 = \tfrac{1}{4} \right\}$$

(the sphere of radius 1/2 centered at (0, 0, 1/2)). The identification works as follows:
the point at $\infty$ is identified with the north pole (0, 0, 1) of the sphere; for other points,
the identification $(X, Y, Z) \in S^2 \leftrightarrow a + ib \in \mathbb{C}$ is given by the two reciprocal relations

$$a + ib = \frac{X}{1-Z} + i\,\frac{Y}{1-Z}, \qquad (X, Y, Z) = \left( \frac{a}{1+a^2+b^2},\ \frac{b}{1+a^2+b^2},\ \frac{a^2+b^2}{1+a^2+b^2} \right).$$

Figure 1.9: The Riemann sphere $\hat{\mathbb{C}} \cong S^2$ and the translation between points $a + ib$ on the complex plane
and points $P = (X, Y, Z)$ on the sphere via stereographic projection. The equator on the sphere is mapped
to the unit circle in $\mathbb{C}$.


Geometrically, this identification corresponds to stereographic projection, where
the point $a + ib$ is calculated from $P = (X, Y, Z)$ by projecting the straight line segment
from the north pole (0, 0, 1) to $P$ further out onto its unique intersection point with
the $x$-$y$ plane, identified with the complex plane $\mathbb{C}$ in the obvious way; see Fig. 1.9. We
can check without difficulty that this geometric identification is a homeomorphism
between $S^2$, equipped with the obvious topology inherited from $\mathbb{R}^3$, and $\hat{\mathbb{C}}$ with the
one-point compactification topology defined above.

- Holomorphic structure. The above definition of what it means for a function on a
neighborhood of $\infty$ to be holomorphic at $\infty$ provides a way of giving $\hat{\mathbb{C}}$ the structure
of a Riemann surface (the simplest nontrivial case of a manifold with a
complex-analytic structure). We will not discuss the topic of Riemann surfaces here; for more
details on this point of view, see, e.g., [23, 60].
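
The two reciprocal relations describing stereographic projection in the geometric structure above can be sketched in code as follows. This is an illustrative sketch only; the function names `plane_to_sphere`, `sphere_to_plane`, and `on_sphere` are hypothetical and do not come from the text. The code maps a complex number to the sphere of radius 1/2 centered at (0, 0, 1/2) and back, and checks that the image point really lies on that sphere.

```python
def plane_to_sphere(w):
    """Map w = a + ib to the point (X, Y, Z) on the sphere of radius 1/2
    centered at (0, 0, 1/2)."""
    s = w.real ** 2 + w.imag ** 2  # a^2 + b^2
    return (w.real / (1 + s), w.imag / (1 + s), s / (1 + s))


def sphere_to_plane(point):
    """Inverse map; the north pole (0, 0, 1) corresponds to infinity."""
    X, Y, Z = point
    if Z == 1:
        raise ValueError("the north pole corresponds to the point at infinity")
    return complex(X / (1 - Z), Y / (1 - Z))


def on_sphere(point, tol=1e-12):
    """Check the defining equation X^2 + Y^2 + (Z - 1/2)^2 = 1/4."""
    X, Y, Z = point
    return abs(X ** 2 + Y ** 2 + (Z - 0.5) ** 2 - 0.25) < tol


w = complex(3, 4)
P = plane_to_sphere(w)
print(P, on_sphere(P), sphere_to_plane(P))
```

Points of the unit circle, such as $w = i$, land on the equator $Z = 1/2$, in agreement with the caption of Figure 1.9.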


From this new point of view of the Riemann sphere, the concept of a meromorphic
function $f : \Omega \to \mathbb{C} \cup \{\text{undefined}\}$ can be seen to coincide with the notion of a holomorphic
function $f : \Omega \to \hat{\mathbb{C}}$; that is, the underlying concept of the definition is still
holomorphicity, but it applies to functions taking values in $\hat{\mathbb{C}}$, a different Riemann surface,
instead of $\mathbb{C}$. Similarly, the idea of a function $f : \Omega \to \mathbb{C}$ being holomorphic at $\infty$
corresponds exactly to the notion of a function whose “true” domain of definition is actually
$\Omega \cup \{\infty\}$ in the sense that it can be extended to a holomorphic function on this larger
domain.

To conclude this section, we also generalize the notion of the order of a zero or pole
at a point to include the behavior at the point at $\infty$. Let $U \subset \mathbb{C}$ be an open set containing
the complement $\mathbb{C} \setminus D_{\le R}(0)$ of a closed disc around 0. We say that a function $f : U \to \mathbb{C}$
has a zero (resp., pole) of order $m$ at $\infty$ if $g(z) = f(1/z)$ has a zero (resp., pole) of order $m$ at $z = 0$
after appropriately defining the value of $g$ at 0.


1.12 Classification of singularities and the Casorati-Weierstrass 
theorem 


If a function $f : \Omega \to \mathbb{C} \cup \{\text{undefined}\}$ is holomorphic in a punctured neighborhood
$D_r(z_0) \setminus \{z_0\}$ of $z_0$, then we say that $f$ has a singularity at $z_0$ if $f$ is not holomorphic at $z_0$.

We classify singularities into three types, two of which we already defined:

- Removable singularities: when $f$ can be made holomorphic at $z_0$ by defining or
redefining its value at $z_0$;

- poles;

- any singularity that is not removable or a pole is called an essential singularity.


For a function defined on a neighborhood of $\infty$ that is not holomorphic at $\infty$, we say
that $f$ has a singularity at $\infty$ and classify the singularity as a removable singularity, a
pole, or an essential singularity according to the type of singularity that $z \mapsto f(1/z)$ has
at $z = 0$.

The function $z \mapsto e^{1/z}$ is an example of a function with an essential singularity at
the point $z = 0$. Its behavior near that singularity is rather difficult to visualize. Indeed,
the next result shows that this is the case more generally.


Theorem 1.46 (Casorati-Weierstrass theorem). If $f$ is holomorphic in a punctured
neighborhood $D_r(z_0) \setminus \{z_0\}$ of $z_0$ and has an essential singularity at $z_0$, then the image
$f(D_r(z_0) \setminus \{z_0\})$ of the punctured neighborhood under $f$ is dense in $\mathbb{C}$.


Proof. We prove the contrapositive of the claim: assume that for some $r > 0$, the image
$f(D_r(z_0) \setminus \{z_0\})$ is not dense. Then the closure $\operatorname{cl}(f(D_r(z_0) \setminus \{z_0\}))$ of this image does
not contain some point $w \in \mathbb{C}$. It follows that the function $g$ defined by $g(z) = \frac{1}{f(z) - w}$ is
holomorphic and bounded in $D_r(z_0) \setminus \{z_0\}$. By Theorem 1.38 its singularity at $z_0$ is removable,
so we can assume that it is holomorphic at $z_0$ after defining its value there. It then
follows that

$$f(z) = w + \frac{1}{g(z)}$$

has either a pole (if $g(z_0) = 0$) or a removable singularity (if $g(z_0) \ne 0$) at $z_0$, that is, the
singularity at $z_0$ is not essential.
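
For the example $z \mapsto e^{1/z}$ mentioned before the theorem, the density of the image can be made quite explicit. The sketch below is an illustration under stated assumptions: the target value `w` is arbitrary, and the helper `solutions_near_zero` is a hypothetical name. For a nonzero target $w$, the points $z_k = 1/(\operatorname{Log} w + 2\pi i k)$ satisfy $e^{1/z_k} = w$ and tend to 0 as $k \to \infty$, so every punctured neighborhood of 0 contains points where the function takes the value $w$ exactly.

```python
import cmath
import math


def solutions_near_zero(w, k_values):
    """Points z_k = 1 / (Log w + 2*pi*i*k), which satisfy exp(1/z_k) = w."""
    return [1 / (cmath.log(w) + 2j * math.pi * k) for k in k_values]


w = complex(-3, 2)  # an arbitrary nonzero target value
zs = solutions_near_zero(w, range(1, 51))
moduli = [abs(z) for z in zs]
errors = [abs(cmath.exp(1 / z) - w) for z in zs]

print(moduli[0], moduli[-1])  # the solutions approach 0
print(max(errors))            # each solution reproduces w up to rounding
```

This particular function does more than the theorem requires (it attains every nonzero value in every punctured neighborhood of 0); the Casorati-Weierstrass theorem only guarantees density of the image.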


1.13 The argument principle and Rouché’s theorem


We define the logarithmic derivative of a holomorphic function $f(z)$ as the function
$f'(z)/f(z)$. Intuitively, this can be thought of as “the derivative of the logarithm of $f$.” A
word of caution is in order however: we have not actually defined what “the logarithm
of $f$” means, and when we actually define it a bit later (in Section 1.15), we will see that
“the logarithm of $f$” does not always exist. The logarithmic derivative on the other hand
clearly exists, so it is best to get used to thinking about it as a separate concept from that
of a logarithm rather than being derived from it.


Lemma 1.47. The logarithmic derivative of a product of holomorphic functions is the sum
of their logarithmic derivatives, that is,

$$\frac{\left(\prod_{k=1}^{n} f_k\right)'}{\prod_{k=1}^{n} f_k} = \sum_{k=1}^{n} \frac{f_k'}{f_k}.$$

Proof. Show this for $n = 2$ and proceed by induction.


Theorem 1.48 (The argument principle). Assume that $f$ is meromorphic in a region $\Omega$ and
that $\gamma$ is a simple closed contour in $\Omega$ enclosing a region $R_\gamma$ such that $f$ has no zeros or
poles on the contour $\gamma$. Denote its zeros and poles inside $R_\gamma$ by $z_1, \ldots, z_n$, where $z_k$ is a zero
of generalized order $m_k = \operatorname{ord}_{z_k}(f)$ (in the sense discussed in Section 1.10, where $m_k$ is a
positive integer if $z_k$ is a zero and a negative integer if $z_k$ is a pole). Then

$$\frac{1}{2\pi i} \oint_{\gamma} \frac{f'(z)}{f(z)}\,dz = \sum_{k=1}^{n} m_k$$

= [total number of zeros of $f$ inside $R_\gamma$, counting multiplicities]

− [total number of poles of $f$ inside $R_\gamma$, counting multiplicities].


Proof. Define

$$g(z) = \prod_{k=1}^{n} (z - z_k)^{-m_k} f(z).$$

Then $g(z)$ is meromorphic on $\Omega$, has no singularities or zeros on $\gamma$, and has no poles or
zeros inside $R_\gamma$, only removable singularities at $z_1, \ldots, z_n$ (so after redefining its values
at these points, we can assume that it is holomorphic on $R_\gamma$). It follows that

$$f(z) = \prod_{k=1}^{n} (z - z_k)^{m_k} g(z).$$

Taking the logarithmic derivative of this equation gives that

$$\frac{f'(z)}{f(z)} = \sum_{k=1}^{n} \frac{m_k}{z - z_k} + \frac{g'(z)}{g(z)}.$$

The result now follows by integrating this equation and using the residue theorem (the
term $g'(z)/g(z)$ is holomorphic on an open set containing $\operatorname{cl}(R_\gamma)$, so by Cauchy’s theorem
its contribution to the integral is 0).
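
The argument principle lends itself to a quick numerical sanity check. In the sketch below, the rational function `f`, its derivative `fprime`, and the helper `zeros_minus_poles` are illustrative assumptions, not objects from the text. The function has a double zero at 0, a simple zero at 1/2, and a simple pole at $i/4$, all inside the unit circle, so the integral of the logarithmic derivative should come out close to $2 + 1 - 1 = 2$.

```python
import cmath
import math


def zeros_minus_poles(f, fprime, center, radius, n=4000):
    """Approximate (1 / (2*pi*i)) * integral of f'/f over the positively
    oriented circle |z - center| = radius."""
    total = 0j
    step = 2 * math.pi / n
    for j in range(n):
        direction = cmath.exp(1j * step * j)
        z = center + radius * direction
        total += fprime(z) / f(z) * (1j * radius * direction * step)
    return total / (2j * math.pi)


# Hypothetical example: f = p / q with p(z) = z^2 (z - 1/2) and q(z) = z - i/4.
def f(z):
    return z ** 2 * (z - 0.5) / (z - 0.25j)


def fprime(z):
    p, dp = z ** 3 - 0.5 * z ** 2, 3 * z ** 2 - z
    q, dq = z - 0.25j, 1
    return (dp * q - p * dq) / q ** 2  # quotient rule


count = zeros_minus_poles(f, fprime, center=0, radius=1)
print(count)  # approximately 2 (three zeros minus one pole)
```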


There is another way to look at the integral $\frac{1}{2\pi i} \oint_\gamma \frac{f'(z)}{f(z)}\,dz$, which gives an alternative
explanation for why it is an integer, as well as an alternative geometric interpretation
of its value. To see this, start by rewriting the integral (using the chain rule (1.84) from
Exercise 1.6) as

$$\frac{1}{2\pi i} \oint_\gamma \frac{f'(z)}{f(z)}\,dz = \frac{1}{2\pi i} \int_a^b \frac{f'(\gamma(t))\,\gamma'(t)}{f(\gamma(t))}\,dt = \frac{1}{2\pi i} \int_a^b \frac{(f \circ \gamma)'(t)}{(f \circ \gamma)(t)}\,dt = \frac{1}{2\pi i} \oint_{f \circ \gamma} \frac{1}{w}\,dw,$$

that is, an integral of $dw/w$ over the contour $f \circ \gamma$, the image of $\gamma$ under $f$. Now note
that the differential form $dw/w$ has a special geometric meaning in complex analysis;
namely, we have

$$\frac{dw}{w} = \text{“}d(\log w)\text{”} = \text{“}d(\log |w| + i \arg w)\text{”}.$$

We put these expressions in quotes since the logarithm and argument are not
single-valued functions (see Section 1.15), so it needs to be explained what such formulas mean.
However, at least $\log |w|$ is well-defined for a curve that does not cross 0, so when
integrating over the closed curve $f \circ \gamma$, the real part is zero by the fundamental theorem
of calculus. The imaginary part (which becomes real after the division by $2\pi i$) can be
interpreted intuitively as the change in the argument over the curve. That is, initially at
time parameter $t = a$, we fix a specific value of $\arg w = \arg f(\gamma(a))$; then as $t$ increases
from $t = a$ to $t = b$, we track the increase or decrease in the argument as we travel
along the curve $f(\gamma(t))$; if this is done correctly (i.e., in a continuous fashion), at the end
the argument must have a well-defined value. Since the curve is closed, the total change
in the argument must be an integer multiple of $2\pi$, so the division by $2\pi$ turns it into an
integer. The value of the integer has the intuitive meaning of “the total number of times
the curve $f \circ \gamma$ goes around the origin.”

This discussion leads us to another important concept, that of winding numbers.
Given a closed curve $\Gamma$ that does not cross 0, the above reasoning involving the
differential form $dw/w$, applied to the curve $\Gamma$ instead of $f \circ \gamma$, shows that an integral of the
form

$$\frac{1}{2\pi i} \oint_\Gamma \frac{dz}{z}$$

carries the meaning of “the total number of times the curve $\Gamma$ goes around the origin,”
with the number being positive if the curve goes in the positive direction around the
origin; negative if the curve goes in the negative direction around the origin; or zero if
there is no net change in the argument. This number is more properly called the winding
number of $\Gamma$ around 0 (also sometimes referred to as the index of the curve around 0)
and denoted $\operatorname{Ind}_\Gamma(0)$:

$$\operatorname{Ind}_\Gamma(0) = \frac{1}{2\pi i} \oint_\Gamma \frac{dz}{z}.$$

More generally, we define the winding number of $\Gamma$ around $z_0$, denoted $\operatorname{Ind}_\Gamma(z_0)$, as

$$\operatorname{Ind}_\Gamma(z_0) = \frac{1}{2\pi i} \oint_\Gamma \frac{dz}{z - z_0},$$

assuming that $\Gamma$ does not cross $z_0$. This can be interpreted as the number of times the
curve $\Gamma$ “winds around” an arbitrary point $z_0$.

To summarize the discussion above, we defined the notion of winding numbers and
explained why the quantity $\frac{1}{2\pi i} \oint_\gamma \frac{f'(z)}{f(z)}\,dz$ that is the subject of the argument principle has
the additional interpretation as the winding number of the curve $f \circ \gamma$ around 0. Note that
the winding number is a topological concept of planar geometry that can be considered
and studied without any reference to complex analysis. It is not very difficult to define
it in purely topological terms without mentioning contour integrals and then show that
the complex analytic and topological definitions coincide, but we will not pursue this
here. Try to think what such a definition might look like.
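
One possible way to make the “track the argument continuously” description concrete, at least for polygonal curves, is sketched below. This is only an illustration and not a definition taken from the text: the function `winding_number` is a hypothetical helper, and it assumes that no segment of the closed polygon passes through the point $z_0$. Each segment sweeps a signed angle in $(-\pi, \pi)$ as seen from $z_0$, and the winding number is the sum of these angles divided by $2\pi$.

```python
import cmath
import math


def winding_number(vertices, z0=0):
    """Winding number around z0 of the closed polygonal curve through the
    given vertices (the last vertex is joined back to the first).
    Assumes no segment of the polygon passes through z0."""
    total_angle = 0.0
    m = len(vertices)
    for j in range(m):
        start = vertices[j] - z0
        end = vertices[(j + 1) % m] - z0
        # Signed angle swept by this segment as seen from z0.
        total_angle += cmath.phase(end / start)
    return round(total_angle / (2 * math.pi))


# Sample the image of the unit circle under the hypothetical map f(z) = z^2.
n = 400
curve = [cmath.exp(2j * math.pi * j / n) ** 2 for j in range(n)]

print(winding_number(curve))          # winds twice around 0
print(winding_number(curve, z0=3))    # does not wind around 3
print(winding_number(curve[::-1]))    # reversed orientation
```

The first value matches the argument principle: $f(z) = z^2$ has a double zero inside the unit circle, so the image curve $f \circ \gamma$ winds twice around the origin.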


Theorem 1.49 (Rouché’s theorem). Assume that $f, g$ are holomorphic on a region $\Omega$
containing a circle $\gamma = C$ and the disc $U$ enclosed by it (or, more generally, a simple closed
contour $\gamma$ enclosing a region $U$). If $|f(z)| > |g(z)|$ for all $z \in \gamma$, then $f$ and $f + g$ have the
same number of zeros in $U$.


Proof. Define $f_t(z) = f(z) + t g(z)$ for $t \in [0,1]$, and note that $f_0 = f$ and $f_1 = f + g$, and
that the condition $|f(z)| > |g(z)|$ on $\gamma$ implies that $f_t$ has no zeros on $\gamma$ for any $t \in [0,1]$.
Denote

$$n_t = \frac{1}{2\pi i} \oint_\gamma \frac{f_t'(z)}{f_t(z)}\,dz,$$

which by the argument principle is the number of “generalized zeros” (zeros or poles,
counting multiplicities) of $f_t$ in $U$. In particular, the function $t \mapsto n_t$ is integer-valued.
If we also knew that it was continuous, then it would have to be constant (by the easy
exercise: any integer-valued continuous function on an interval $[a, b]$ is constant), so in
particular we would get the desired conclusion that $n_1 = n_0$.

To prove continuity of $n_t$, fix a number $\epsilon > 0$. Note that the function $g(t,z) = f_t'(z)/f_t(z)$
is continuous, hence also uniformly continuous, on the compact set $[0, 1] \times \gamma$.
Therefore there exists $\delta > 0$ such that if $0 \le t, s \le 1$ satisfy $|t - s| < \delta$, then
$|g(t, z) - g(s, z)| < 2\pi\epsilon/\operatorname{len}(\gamma)$ (recall that $\operatorname{len}(\gamma)$ denotes the length of the curve $\gamma$). It follows that
for such $t, s$, we have

$$|n_t - n_s| \le \frac{1}{2\pi} \oint_\gamma |g(t,z) - g(s,z)|\,|dz| < \frac{1}{2\pi} \cdot \frac{2\pi\epsilon}{\operatorname{len}(\gamma)} \oint_\gamma |dz| = \epsilon.$$

This is exactly what is needed to show that $t \mapsto n_t$ is continuous.


Rouché’s theorem has a rather amusing intuitive explanation (which I learned from 
the book [48]). The slogan to remember is “walking the dog.” Imagine that you are walk- 
ing in a large empty park containing at some “origin” point 0 a large lamppost. You start 
at some point X and go for a walk along some curve, ending back at the same starting 
point X. Let N denote your winding number around the lamppost at the origin—that is, 
the total number of times you went around the lamppost with appropriate sign. 

Now imagine that you also have a dog that is walking alongside you in some erratic
path that is sometimes close to you, sometimes less close. As you traverse your curve $C_1$,
the dog walks along on its own curve $C_2$, which also begins and ends in the same place.
Let $M$ denote the dog’s winding number around the lamppost at the origin. Can we say
that $N = M$? The answer is yes, we can, provided that we know the dog’s distance to you
was always less than your distance to the lamppost. To see this, imagine that you had
the dog on a retractable leash; if the distance condition was not satisfied, it would be
possible for the dog to reach the lamppost and go in a short tour around it while you
were still far away and not turning around the lamppost, causing an entanglement of
the leash with the pole.

The above scenario maps in a precise way to Rouché’s theorem using the following
dictionary: the curve $f \circ \gamma$ represents your path; the curve $(f + g) \circ \gamma$ represents the dog’s
path; $g \circ \gamma$ represents the vector pointing from you to the dog; the condition $|f| > |g|$
along $\gamma$ is precisely the condition that the dog stays closer to you than your distance to
the pole; and the conclusion that the two winding numbers are the same is precisely the
statement of the theorem that $f$ and $f + g$ have the same number of generalized zeros in
the region $U$ enclosed by $\gamma$ (see the discussion above regarding the connection between
the integral $(2\pi i)^{-1} \oint_\gamma f'/f\,dz$ and the winding number of $f \circ \gamma$ around 0).

I recommend spending a few minutes thinking about the above correspondence 
and making sure you understand it. You may forget the technical details of the proof of 
Rouché’s theorem in a few weeks or months, but I hope you will remember this intuitive 
explanation for a long time. 

Rouché’s theorem is an important tool both for numerically estimating the numbers
of roots of polynomials and other functions in regions of interest and for theoretical
applications. One illustration of the power of Rouché’s theorem is given in Exercise 1.30.
In the next section, we also use Rouché’s theorem to prove two more well-known
properties of holomorphic functions, the open mapping theorem and the maximum
modulus principle.
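
As a small worked illustration of how Rouché’s theorem locates roots, consider the polynomial $h(z) = z^5 + 3z + 1$ (an example chosen here for illustration, not taken from the text). On $|z| = 1$ the term $f(z) = 3z$ dominates $g(z) = z^5 + 1$, since $|f| = 3 > 2 \ge |g|$, so $h$ has exactly one zero in the unit disc; on $|z| = 2$ the term $z^5$ dominates $3z + 1$, so all five zeros lie in $|z| < 2$. The sketch below checks both the dominance condition and the resulting counts numerically, using the argument principle; the helper names are hypothetical.

```python
import cmath
import math


def count_zeros_in_disc(h, hprime, radius, n=4000):
    """Number of zeros of h in |z| < radius, via the argument principle
    (assumes h is holomorphic and has no zeros on the circle)."""
    total = 0j
    step = 2 * math.pi / n
    for j in range(n):
        z = radius * cmath.exp(1j * step * j)
        total += hprime(z) / h(z) * (1j * z * step)  # dz = i z d(theta)
    return round((total / (2j * math.pi)).real)


def dominance_margin(f, g, radius, n=4000):
    """Smallest sampled value of |f| - |g| on the circle |z| = radius."""
    points = (radius * cmath.exp(2j * math.pi * j / n) for j in range(n))
    return min(abs(f(z)) - abs(g(z)) for z in points)


def h(z):
    return z ** 5 + 3 * z + 1


def hprime(z):
    return 5 * z ** 4 + 3


# |z| = 1: f = 3z dominates g = z^5 + 1.
margin_1 = dominance_margin(lambda z: 3 * z, lambda z: z ** 5 + 1, 1)
# |z| = 2: f = z^5 dominates g = 3z + 1.
margin_2 = dominance_margin(lambda z: z ** 5, lambda z: 3 * z + 1, 2)

print(margin_1, count_zeros_in_disc(h, hprime, 1))
print(margin_2, count_zeros_in_disc(h, hprime, 2))
```

Together the two counts show that four of the five zeros of $h$ lie in the annulus $1 < |z| < 2$.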


Suggested exercises for Section 1.13. 1.29, 1.30. 


1.14 The open mapping theorem and maximum modulus principle 


Theorem 1.50 (Open mapping theorem). Any holomorphic function that is not constant 
is an open mapping, that is, it maps open sets to open sets. 


Proof. Let $f$ be holomorphic and nonconstant in a region $\Omega$. Fix an arbitrary $z_0 \in \Omega$,
and denote $w_0 = f(z_0)$. We need to show that $f(\Omega)$ contains a neighborhood of $w_0$, that
is, that there exists some $\delta > 0$ for which $f(\Omega) \supset D_\delta(w_0)$. The reason Rouché’s theorem
can be brought into play is that the inclusion $f(\Omega) \supset D_\delta(w_0)$ amounts to the statement
that for $w \in D_\delta(w_0)$, the function $f(z) - w$ has at least one zero; and we know that this
is true for the function $f(z) - w_0$, so we are precisely in a situation in which we want to
compare the number of zeros of two functions, where (if we restrict our point of view
to what is happening in a small neighborhood of $z_0$) one function can be regarded as a
perturbation of the other.

To make this idea precise, define

$$F(z) = f(z) - w_0,$$
$$G_w(z) = w_0 - w,$$
$$h_w(z) = F(z) + G_w(z) = f(z) - w.$$


Let $\epsilon > 0$ be a number small enough so that the closed disc $D_{\le \epsilon}(z_0)$ is contained in $\Omega$ and
such that the point $z = z_0$ is the only zero of $F(z)$ in this closed disc. (Such $\epsilon$ exists by the
property that zeros of holomorphic functions are isolated.) Now define

$$\delta = \inf\{|f(z) - w_0| : z \in \partial D_\epsilon(z_0)\}. \tag{1.64}$$

By construction we have that $\delta > 0$ and $|f(z) - w_0| \ge \delta$ for $z$ on the circle $|z - z_0| = \epsilon$. This
means that for any $w \in D_\delta(w_0)$, the condition $|F(z)| > |G_w(z)|$ in Rouché’s theorem will be
satisfied for $z \in \partial D_\epsilon(z_0)$. The conclusion is that the equation $h_w(z) = 0$ (or, equivalently,
$f(z) = w$) has the same number of solutions as the equation $f(z) = w_0$ in the disc $D_\epsilon(z_0)$.
The latter equation has the point $z = z_0$ as its only solution, so, counting multiplicities, this
number is at least 1. Thus we have shown
that for $w \in D_\delta(w_0)$ with $\delta$ defined in (1.64), there exists $z \in D_\epsilon(z_0)$ such that $f(z) = w$.
This was precisely what we needed to establish that $f$ is an open mapping.


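Theorem 1.51 (Maximum modulus principle). If $f$ is a nonconstant holomorphic function
on a region $\Omega$, then $|f|$ cannot attain a maximum on $\Omega$.

Proof. This follows immediately from the open mapping theorem: if $|f|$ attained a maximum
at some point $z_0 \in \Omega$, then the open set $f(\Omega)$ would contain a disc around $f(z_0)$ and hence
points of modulus larger than $|f(z_0)|$.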


For an interesting application of the maximum modulus principle, see Section 3.5. 


1.15 The logarithm function 


The logarithm function can be defined as

$$\log z = \log |z| + i \arg z$$

on any region $\Omega$ that does not contain 0 and where we can make a consistent, smoothly
varying choice of $\arg z$ as $z$ ranges over $\Omega$. It is easy to see that this formula gives an
inverse to the exponential function.²

For example, if

$$\Omega = \mathbb{C} \setminus (-\infty, 0]$$

(the “slit complex plane” with the negative real axis removed), then we can set

$$\operatorname{Log} z = \log |z| + i \operatorname{Arg} z,$$

where $\operatorname{Arg} z$ is defined as a choice of $\arg z$ that takes values in $(-\pi, \pi)$. The function $\operatorname{Log} z$
is called the principal branch of the logarithm, a kind of standard version of the log
function that complex analysts have agreed to use whenever this is reasonably
convenient. However, sometimes we may want to consider the logarithm function on stranger
or more complicated regions. When can this be made to work? The answer is: when $\Omega$
is simply connected. We now give two results making this notion precise, the first
involving a situation where the logarithm exists and can be made unique in a relatively
canonical way, and the second in a more general setting that forces us to accept a (mild)
lack of uniqueness.


² Logarithms in complex analysis are a subtle concept. One common source of confusion is that the
language used to refer to them is inconsistent with their properties: it is common to speak of “the logarithm
function” when the use of the definite article is potentially at odds with the fact that a function satisfying
the properties of a logarithm is not unique.


Theorem 1.52 (Existence of the logarithm: first version). Assume that $\Omega$ is a simply
connected region with $0 \notin \Omega$, $1 \in \Omega$. There exists a unique function $F : \Omega \to \mathbb{C}$ with the
following properties:

i) $F$ is holomorphic in $\Omega$.

ii) $e^{F(z)} = z$ for all $z \in \Omega$.

iii) $F(r) = \log r$ (the usual logarithm for real numbers) for all real numbers $r \in \Omega$
sufficiently close to 1.


Proof. Uniqueness: if $F$ and $G$ are two functions satisfying the properties listed in the
theorem, then since $F(r) = G(r)$ for real $r$ in a neighborhood of 1, we must have $F = G$
by Corollary 1.36.

Existence: we define $F$ as a primitive function of the function $z \mapsto 1/z$, guaranteed
to exist by Corollary 1.25. We can assume without loss of generality that $F(1) = 0$. We
then have that

$$\frac{d}{dz}\left(z e^{-F(z)}\right) = e^{-F(z)} - z F'(z) e^{-F(z)} = e^{-F(z)}\left(1 - \frac{z}{z}\right) = 0,$$

so $z e^{-F(z)}$ is a constant function. Since its value at $z = 1$ is 1, we see that $e^{F(z)} = z$, as
required. Finally, let $\epsilon$ be chosen small enough so that the interval $(1-\epsilon, 1+\epsilon)$ is contained
in $\Omega$. Then for $r \in (1-\epsilon, 1+\epsilon)$, the fundamental theorem of calculus gives that

$$F(r) = F(1) + \int_1^r F'(x)\,dx = 0 + \int_1^r \frac{dx}{x} = \log r.$$

Note that, a bit counterintuitively, the conclusion that $F(r) = \log r$ in the theorem
may not be satisfied for all positive real $r \in \Omega$; see Exercise 1.33.


Theorem 1.53 (Existence of the logarithm: second version). Assume that $\Omega$ is a simply
connected region with $0 \notin \Omega$. There exists a function $F : \Omega \to \mathbb{C}$ with the following
properties:

i) $F$ is holomorphic in $\Omega$.

ii) $e^{F(z)} = z$ for all $z \in \Omega$.


The function $F$ is unique up to an additive integer multiple of $2\pi i$ in the following sense: if
$G$ is another function satisfying the same properties, then we have

$$G(z) = F(z) + 2\pi i k \tag{1.65}$$

for some integer $k$; conversely, any function $G$ of the form (1.65) for some $k \in \mathbb{Z}$ satisfies
the same properties.


Proof. Exercise 1.34. 


A function $F$ with the properties given in Theorem 1.53 is called a branch of the
logarithm function on $\Omega$.
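
The non-uniqueness in (1.65) is easy to observe numerically on the slit plane. The following sketch assumes Python's standard `cmath.log`, which computes the principal branch; the helper `log_branch` is a hypothetical name introduced for illustration. It checks that shifting the principal branch by $2\pi i k$ still inverts the exponential function, and it shows one familiar pitfall: for the principal branch, $\operatorname{Log}(uv)$ can differ from $\operatorname{Log} u + \operatorname{Log} v$ by $2\pi i$.

```python
import cmath
import math


def log_branch(z, k=0):
    """The branch Log z + 2*pi*i*k on the plane slit along (-inf, 0]."""
    return cmath.log(z) + 2j * math.pi * k


z = complex(-1, 1)
values = [log_branch(z, k) for k in (-1, 0, 1)]
print([cmath.exp(v) for v in values])  # each is (approximately) z

# With u = exp(3*pi*i/4), the product u*u = exp(3*pi*i/2) = -i has
# principal argument -pi/2, not 3*pi/2.
u = cmath.exp(0.75j * math.pi)
gap = log_branch(u) + log_branch(u) - log_branch(u * u)
print(gap)  # approximately 2*pi*i
```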

Next, we generalize the concept of a logarithm further by considering the following
question: given a region $\Omega$ and a holomorphic function $f : \Omega \to \mathbb{C}$, when can we “take the
logarithm of $f$”? That is, does there exist a holomorphic function $g$ for which $e^{g(z)} = f(z)$?
An obvious necessary condition is that $f$ must not have any zeros; this generalizes the
requirement that $0 \notin \Omega$ from Theorems 1.52 and 1.53. If $\Omega$ is simply connected, then this
is also a sufficient condition. The precise result, including the extent to which the choice
of logarithm is unique, is as follows.

Theorem 1.54 (Existence of the logarithm of a function). If $f$ is a holomorphic function on
a simply connected region $\Omega$ and $f \ne 0$ on $\Omega$, then there exists a holomorphic function $g$
on $\Omega$ satisfying

$$e^{g(z)} = f(z).$$

The function $g$ is unique up to an additive constant of the form $2\pi i k$ with integer $k$.


Proof. The idea is to define $g$ as a primitive function of the function $z \mapsto f'(z)/f(z)$; then
the reasoning is similar to the proof of Theorem 1.52. The details are left as an exercise
(Exercise 1.35).


On a simply connected region $\Omega$, we can now define the power function $z \mapsto z^a$ for
an arbitrary $a \in \mathbb{C}$ by setting

$$z^a = e^{a F(z)},$$

where $F$ is some branch of the logarithm on $\Omega$.³ In the particular case $a = 1/n$ with
positive integer $n$, this has the meaning of the $n$th root function $z \mapsto z^{1/n}$, which satisfies

$$\left(z^{1/n}\right)^n = \left(e^{\frac{1}{n} F(z)}\right)^n = e^{F(z)} = z.$$

If $f(z) = z^{1/n}$ is an $n$th root function associated with some branch of the logarithm, then
for any $0 \le k \le n - 1$, the function $g(z) = e^{2\pi i k/n} f(z)$ will be another function satisfying
$g(z)^n = z$. Conversely, it is easy to see that those are precisely the possible choices for
an $n$th root function. That is, $n$th root functions are unique up to multiplication by an
arbitrary $n$th root of unity.
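
The description of $n$th roots above translates directly into a short computation. The sketch below is illustrative only: `nth_roots` is a hypothetical helper, and it uses the principal branch computed by Python's `cmath.log` as the branch $F$. It lists the values at a point $z$ of all $n$ possible $n$th root functions, obtained from the principal one by multiplying by the $n$th roots of unity.

```python
import cmath
import math


def nth_roots(z, n):
    """Values at z of the n branches of the nth root: the principal value
    exp(Log(z) / n) multiplied by each nth root of unity."""
    principal = cmath.exp(cmath.log(z) / n)
    return [cmath.exp(2j * math.pi * k / n) * principal for k in range(n)]


roots = nth_roots(8j, 3)
for r in roots:
    print(r, r ** 3)  # each cube is (approximately) 8i
```

For $z = 8i$ and $n = 3$ the principal value is $2e^{i\pi/6}$, and the branch with $k = 2$ gives $-2i$, which indeed satisfies $(-2i)^3 = 8i$.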

Generalizing power functions further in a similar way as we did for the logarithm,
if $\Omega$ is a simply connected region, $f$ is a holomorphic function on $\Omega$ that has no zeros, $g$
is a branch of the logarithm of $f$, and $a \in \mathbb{C}$ is an arbitrary complex number, then the
function $h(z) = e^{a g(z)}$ can be interpreted as the power function “$f$ raised to the power
$a$.” In particular, for $a = 1/n$ ($n$ a positive integer), this function is usually referred to as
(a branch of) the $n$th root of $f$ and has the property that $h(z)^n = f(z)$.


³ As with the phrase “the logarithm function,” saying “the power function” is somewhat misleading;
it is more correct to say “a branch of the power function.” However, mathematicians are human and
prone to employing mental shortcuts just like everyone else, so in practice, you will rarely encounter
mathematicians in the real world employing such precise terminology.


Suggested exercises for Section 1.15. 1.31, 1.32, 1.33, 1.34, 1.35.


1.16 The local behavior of holomorphic functions


In Section 1.3, we considered what the property of being holomorphic at a point $z_0$ says
about the local behavior of the function near the point, focusing on the case when the
derivative $f'(z_0)$ does not vanish. We now give a more complete analysis that covers a
more general situation. As we will show in Theorem 1.57, for a function $f$ holomorphic
in the neighborhood of a point $z_0$, we can canonically express the function as $c + w^k$,
where $w$ is a new variable associated with $z$ near $z_0$, which takes values in a small disc around 0.
Thus, loosely speaking, $f$ behaves locally “like a power function $w \mapsto w^k$.”

We say that a holomorphic function $f : \Omega \to \mathbb{C}$ is locally injective near a point
$z_0 \in \Omega$ if there is a neighborhood $U$ of $z_0$ such that the restriction of $f$ to $U$ is injective.


Lemma 1.55. Let $f : \Omega \to \mathbb{C}$ be a holomorphic function, and let $z_0 \in \Omega$. If $f'(z_0) \ne 0$, then
$f$ is locally injective near $z_0$.


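Proof. Denote $A = f'(z_0)$ and $\epsilon = |A|^2$. Denoting $\langle z, w \rangle = \operatorname{Re}(z \overline{w})$ (the standard inner
product in the plane), we have $\langle f'(z_0), A \rangle = |A|^2 > \epsilon/2$, and therefore by continuity also
$\langle f'(z), A \rangle > \epsilon/2$ for all $z$ in some disc $D_\delta(z_0)$. Let $z_1$ and $z_2$ be distinct points in $D_\delta(z_0)$.
Then we have

$$f(z_2) - f(z_1) = \int_{z_1}^{z_2} f'(z)\,dz = (z_2 - z_1) \int_0^1 f'(z_1 + t(z_2 - z_1))\,dt.$$

This implies that

$$\left\langle \overline{(z_2 - z_1)}\,(f(z_2) - f(z_1)), A \right\rangle = \left\langle \overline{(z_2 - z_1)}\,(z_2 - z_1) \int_0^1 f'(z_1 + t(z_2 - z_1))\,dt,\ A \right\rangle$$
$$= |z_2 - z_1|^2 \int_0^1 \left\langle f'(z_1 + t(z_2 - z_1)), A \right\rangle dt \ge \frac{\epsilon}{2}\,|z_2 - z_1|^2 > 0.$$

In particular, $f(z_2) - f(z_1) \ne 0$.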


The next two classic results are both important consequences. 


Theorem 1.56 (Inverse function theorem). Let $f : \Omega \to \mathbb{C}$ be a holomorphic function. Let
$z_0 \in \Omega$, and denote $w_0 = f(z_0)$. Assume that $f'(z_0) \ne 0$. Then $f$ has a local holomorphic
inverse. More precisely, there exist an open neighborhood $U$ of $z_0$, an open neighborhood
$V$ of $w_0$, and a holomorphic function $g : V \to U$ such that:

1. $f$ maps $U$ bijectively onto $V$;

2. $g$ maps $V$ bijectively onto $U$;

3. $g = f^{-1}$ (in the set-theoretic sense of an inverse function);

4. The derivative of the inverse function $g$ is given by

$$g'(w_0) = \frac{1}{f'(z_0)}. \tag{1.66}$$

Proof. By Lemma 1.55, $f$ maps an open neighborhood $U = D_\delta(z_0)$ bijectively onto $V = f(U)$.
By the open mapping theorem, $V$ is an open neighborhood of $w_0$. Since the
restriction $f|_U : U \to V$ of $f$ to $U$ is continuous and open, it is a homeomorphism. Denote
its inverse by $g : V \to U$. To see that $g$ is holomorphic at $w_0$, observe that

$$\lim_{w \to w_0} \frac{g(w) - g(w_0)}{w - w_0} = \lim_{z \to z_0} \frac{g(f(z)) - g(f(z_0))}{f(z) - f(z_0)} = \lim_{z \to z_0} \frac{z - z_0}{f(z) - f(z_0)} = \left( \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \right)^{-1} = \frac{1}{f'(z_0)}, \tag{1.67}$$

which also gives formula (1.66). Similarly, replacing $z_0$ and $w_0$ in (1.67) by an arbitrary
pair of points $z_1 \in U$ and $w_1 = f(z_1)$ proves that $g$ is holomorphic on all of $V$.


Theorem 1.57 (Local behavior of holomorphic functions). Let $f$ be holomorphic in a region
$\Omega$. Let $z_0 \in \Omega$, and let $k \ge 1$ denote the order of the zero of $f(z) - f(z_0)$ at $z_0$. Then there
exist an open neighborhood $U$ of $z_0$, a number $r > 0$, and a function $\varphi : U \to D_r(0)$ such
that:

1. $\varphi$ is holomorphic and bijective, and the inverse function $\varphi^{-1}$ is also holomorphic;⁴

2. $\varphi(z_0) = 0$;

3. We have

$$f(z) = f(z_0) + \varphi(z)^k \qquad (z \in U). \tag{1.68}$$

In other words, under the change of variables $w = \varphi(z)$, the function $z \mapsto f(z)$, $z \in U$,
is represented as $w \mapsto f(z_0) + w^k$, $w \in D_r(0)$, in terms of the new variable $w$.


⁴ A function with these properties is called a biholomorphism; see Chapter 3.


Proof. By the definition of $k$ we can represent $f$ as

$$f(z) = f(z_0) + (z - z_0)^k g(z)$$

with $g$ holomorphic and $g(z_0) \ne 0$. Since zeros of holomorphic functions are isolated, $g$
is also nonzero in some disc $D_\epsilon(z_0)$, so by the discussion about $n$th roots at the end of the
previous section we can express $g$ as

$$g(z) = h(z)^k \qquad (z \in D_\epsilon(z_0)) \tag{1.69}$$

for some function $h$ that is holomorphic (and also nonzero by (1.69)) in $D_\epsilon(z_0)$. If we now
define

$$H(z) = (z - z_0) h(z),$$

then we see that $f(z)$ can be expressed as

$$f(z) = f(z_0) + \bigl((z - z_0) h(z)\bigr)^k = f(z_0) + H(z)^k, \tag{1.70}$$

a representation that is similar to (1.68), but not yet with the correct domain and range
claimed in the theorem. Note that $H(z_0) = 0$ and $H'(z_0) = h(z_0) \ne 0$. By Lemma 1.55, $H$ is
locally injective near $z_0$, that is, its restriction $H|_{D_\delta(z_0)}$ to a smaller disc $D_\delta(z_0)$ for some
$\delta \in (0, \epsilon)$ is injective. The restricted function, being holomorphic, is also an open continuous
mapping, so it maps the disc $D_\delta(z_0)$ homeomorphically to some open set $V$ containing
$H(z_0) = 0$. Let $r > 0$ be such that $D_r(0) \subset V$, and denote $U = (H|_{D_\delta(z_0)})^{-1}(D_r(0))$. Then
$\varphi := H|_U$ (the further restriction of $H$ to $U$) maps $U$ bijectively and homeomorphically
onto $D_r(0)$, and its inverse is holomorphic. The above remarks together with (1.70) show
that it satisfies the properties claimed in the theorem. The proof is complete.
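
The construction in the proof can be followed on a small explicit example, sketched below under stated assumptions. Take the hypothetical function $f(z) = z^2 + z^3$ near $z_0 = 0$, so that $k = 2$, $g(z) = 1 + z$, $h(z) = \sqrt{1 + z}$ (principal branch, via `cmath.sqrt`), and $\varphi(z) = H(z) = z\sqrt{1+z}$. The code checks the identity $f(z) = \varphi(z)^2$ at a sample point and then uses a plain Newton iteration (with no convergence safeguards) to find a second point $z_2 \ne z_1$ with $\varphi(z_2) = -\varphi(z_1)$, which therefore has $f(z_2) = f(z_1)$: locally, $f$ is two-to-one, just like $w \mapsto w^2$.

```python
import cmath


def f(z):
    return z ** 2 + z ** 3  # f(z) - f(0) has a zero of order k = 2 at 0


def phi(z):
    # phi(z) = z * h(z) with h(z) = sqrt(1 + z), so that h(z)^2 = g(z) = 1 + z.
    return z * cmath.sqrt(1 + z)


def phi_prime(z):
    root = cmath.sqrt(1 + z)
    return root + z / (2 * root)


def solve_phi(target, guess, steps=30):
    """Newton iteration for phi(z) = target (a sketch; no convergence check)."""
    z = guess
    for _ in range(steps):
        z = z - (phi(z) - target) / phi_prime(z)
    return z


z1 = complex(0.1, 0.05)
z2 = solve_phi(-phi(z1), guess=-z1)

print(abs(f(z1) - phi(z1) ** 2))  # identity (1.68) at z1, up to rounding
print(z1, z2, abs(f(z1) - f(z2)))  # two distinct points with the same value
```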


Corollary 1.58. A holomorphic function $f : \Omega \to \mathbb{C}$ is locally injective near $z_0 \in \Omega$ if and
only if $f'(z_0) \ne 0$.


Suggested exercises for Section 1.16. 1.36. 


1.17 Infinite products and the product representation of the sine 
function 


Complex analysis abounds in esthetically appealing identities involving integrals and 
infinite sums. We will also encounter a variety of beautiful identities involving infinite 
products. In this section, we develop the basic theory of such products and illustrate it 
in one particularly elegant example, the infinite product identity for the sine function. 


1.17.1 Infinite products of complex numbers 


Let $(c_n)_{n=1}^{\infty}$ be a sequence of complex numbers. The infinite product $\prod_{n=1}^{\infty} c_n$ is defined
as the limit of finite (partial) products $\lim_{N \to \infty} \prod_{n=1}^{N} c_n$ if the limit exists. In that case,
we say that the product $\prod_{n=1}^{\infty} c_n$ converges.


Proposition 1.59. For a sequence of complex numbers $(a_n)_{n=1}^{\infty}$, if $\sum_{n=1}^{\infty} |a_n| < \infty$, then the
infinite product $\prod_{n=1}^{\infty} (1 + a_n)$ converges, and its value is 0 if and only if one of the factors
$1 + a_n$ is equal to 0.


Proof. Under the assumption, there exists some large enough $N_0 \ge 1$ such that $|a_n| < 1/2$
for all $n \ge N_0$. This implies that $1 + a_n = \exp(\operatorname{Log}(1 + a_n))$, where $\operatorname{Log}(z)$ is the principal
branch of the logarithm function. Now by the Taylor expansion of the function
$z \mapsto \operatorname{Log}(z)$ (Exercise 1.31) there is some constant $C > 0$ such that

\[ |\operatorname{Log}(1 + w)| \le C |w| \qquad \text{if } |w| < 1/2. \]

It follows that

\[ \sum_{n=N_0}^\infty |\operatorname{Log}(1 + a_n)| \le C \sum_{n=N_0}^\infty |a_n| < \infty, \]

so in particular, the series $\sum_{n=N_0}^\infty \operatorname{Log}(1 + a_n)$ converges. We can now write

\[
\begin{aligned}
\prod_{n=N_0}^\infty (1 + a_n)
&= \lim_{N\to\infty} \prod_{n=N_0}^N (1 + a_n)
 = \lim_{N\to\infty} \prod_{n=N_0}^N \exp(\operatorname{Log}(1 + a_n)) \\
&= \lim_{N\to\infty} \exp\Bigl( \sum_{n=N_0}^N \operatorname{Log}(1 + a_n) \Bigr)
 = \exp\Bigl( \lim_{N\to\infty} \sum_{n=N_0}^N \operatorname{Log}(1 + a_n) \Bigr) \\
&= \exp\Bigl( \sum_{n=N_0}^\infty \operatorname{Log}(1 + a_n) \Bigr).
\end{aligned}
\]

Thus we have proved that the infinite product $\prod_{n=N_0}^\infty (1 + a_n)$ converges, and, moreover,
it converges to a nonzero value. Therefore, trivially, the product $\prod_{n=1}^\infty (1 + a_n)$ also
converges and is equal to zero if and only if one of the factors $1 + a_n$ for $1 \le n < N_0$ is
zero.
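
The mechanism of the proof can be observed numerically. The following is a minimal sketch, not part of the original text: the test sequence $a_n = i/n^2$ and the function names are our own illustrative choices. It compares the partial products of $\prod (1 + a_n)$ with the exponential of the partial sums of $\operatorname{Log}(1 + a_n)$, and shows the partial products stabilizing at a nonzero value.

```python
import cmath


def partial_product(a, N):
    """Partial product prod_{n=1}^{N} (1 + a(n))."""
    p = 1 + 0j
    for n in range(1, N + 1):
        p *= 1 + a(n)
    return p


def exp_of_log_sum(a, N):
    """exp(sum_{n=1}^{N} Log(1 + a(n))), using the principal branch of the logarithm."""
    s = 0j
    for n in range(1, N + 1):
        s += cmath.log(1 + a(n))
    return cmath.exp(s)


def a(n):
    # Hypothetical test sequence (an assumption for illustration):
    # sum |a_n| = pi^2 / 6 is finite, and no factor 1 + a_n vanishes.
    return 1j / n**2


if __name__ == "__main__":
    for N in (10, 1000, 100000):
        print(N, partial_product(a, N), exp_of_log_sum(a, N))
```

The two columns agree up to rounding error for every $N$, and successive partial products differ by an amount of the order of the tail $\sum_{n > N} |a_n|$, in line with the estimate used in the proof.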


1.17.2 Infinite products of holomorphic functions 


Proposition 1.60. Let $(f_n)_{n=1}^\infty$ be a sequence of holomorphic functions on a region $\Omega$. If the
series $\sum_{n=1}^\infty |f_n|$ converges uniformly on compacts in $\Omega$, then the infinite product
$F(z) = \prod_{n=1}^\infty (1 + f_n(z))$ also converges uniformly on compacts. The limiting function $F(z)$ is
holomorphic and is nonzero everywhere except at the points $z$ for which $1 + f_n(z) = 0$ for
some $n$.


Proof. Proposition 1.59 implies that the infinite product $\prod_{n=1}^\infty (1 + f_n(z))$ converges for any
$z \in \Omega$ (to a nonzero limit unless one of the factors vanishes at $z$). By repeating the same
estimates in the proof of that proposition in the context of $z$ being allowed to range on a
compact subset $K \subset \Omega$, we see that the sequence of partial products $\prod_{n=1}^N (1 + f_n)$ actually
converges uniformly on compacts, so the limiting function is holomorphic. The claim about the
set of points $z$ for which $F(z) = 0$ is an immediate consequence of the corresponding condition
in Proposition 1.59.


Proposition 1.61. Under the assumptions of Proposition 1.60, the logarithmic derivative
of the infinite product $\prod_{n=1}^\infty (1 + f_n)$ is the sum of the logarithmic derivatives of the individual
factors, that is,

\[ \frac{\bigl( \prod_{n=1}^\infty (1 + f_n) \bigr)'}{\prod_{n=1}^\infty (1 + f_n)} = \sum_{n=1}^\infty \frac{f_n'}{1 + f_n}. \tag{1.71} \]

Moreover, the infinite series in (1.71) converges uniformly on compacts in the set
$\{ z \in \Omega : \prod_{n=1}^\infty (1 + f_n(z)) \neq 0 \}$.

Proof. Exercise 1.37.


1.17.3 The sine function 


As an illustration of the theory of infinite products, we prove the following classic result. 


Theorem 1.62 (Infinite product formula for the sine function).

\[ \sin(\pi z) = \pi z \prod_{n=1}^\infty \Bigl( 1 - \frac{z^2}{n^2} \Bigr) \qquad (z \in \mathbb{C}). \tag{1.72} \]
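
As a sanity check, the product formula can be tested numerically by truncating the product. This is only a sketch under our own assumptions: the function name `sine_product` and the sample point are illustrative, and the truncation error decays slowly (roughly like $|z|^2/N$ in relative terms), so many factors are needed for modest accuracy.

```python
import cmath
import math


def sine_product(z, N):
    """Truncated product pi * z * prod_{n=1}^{N} (1 - z^2 / n^2)."""
    p = math.pi * z
    for n in range(1, N + 1):
        p *= 1 - (z * z) / (n * n)
    return p


if __name__ == "__main__":
    z = 0.3 + 0.4j  # arbitrary sample point, chosen for illustration
    for N in (10, 1000, 100000):
        err = abs(sine_product(z, N) - cmath.sin(math.pi * z))
        print(N, err)
```

At an integer such as $z = 2$, the factor with $n = 2$ vanishes, so the truncated product is exactly zero as soon as $N \ge 2$, matching the zeros of $\sin(\pi z)$.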


Theorem 1.62 often comes up in an equivalent form of an infinite series identity, 
obtained by taking the logarithmic derivative of both sides of (1.72). This result is known 
as the partial fraction expansion of the cotangent function. 


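Theorem 1.63 (Partial fraction expansion of the cotangent function). The rescaled cotangent
function $\pi \cot(\pi z)$ has three representations

\[
\begin{aligned}
\pi \cot(\pi z)
&= \lim_{N\to\infty} \sum_{n=-N}^{N} \frac{1}{z + n} \\
&= \frac{1}{z} + \sum_{n \neq 0} \Bigl( \frac{1}{z + n} - \frac{1}{n} \Bigr) \\
&= \frac{1}{z} + \sum_{n=1}^\infty \frac{2z}{z^2 - n^2},
\end{aligned}
\tag{1.73}
\]

valid for all $z \in \mathbb{C} \setminus \mathbb{Z}$.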
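
The expansion is easy to probe numerically. The sketch below is illustrative only (the function names and the sample point are our assumptions). It truncates the symmetric sum $\sum_{n=-N}^{N} \frac{1}{z+n}$ and the paired form $\frac{1}{z} + \sum_{n=1}^{N} \frac{2z}{z^2 - n^2}$, which agree term by term because $\frac{1}{z+n} + \frac{1}{z-n} = \frac{2z}{z^2 - n^2}$, and compares both with $\pi \cot(\pi z)$.

```python
import cmath
import math


def pi_cot(z):
    """The function pi * cot(pi * z)."""
    return math.pi * cmath.cos(math.pi * z) / cmath.sin(math.pi * z)


def symmetric_sum(z, N):
    """Symmetric partial sum: sum_{n=-N}^{N} 1 / (z + n)."""
    return sum(1 / (z + n) for n in range(-N, N + 1))


def paired_sum(z, N):
    """Paired partial sum: 1/z + sum_{n=1}^{N} 2z / (z^2 - n^2)."""
    return 1 / z + sum(2 * z / (z * z - n * n) for n in range(1, N + 1))


if __name__ == "__main__":
    z = 0.3 + 0.4j  # arbitrary non-integer sample point
    for N in (10, 1000, 100000):
        print(N, abs(symmetric_sum(z, N) - pi_cot(z)), abs(paired_sum(z, N) - pi_cot(z)))
```

The error decays only like $1/N$, which reflects the fact that the two-sided series converges conditionally and must be summed symmetrically.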


The equivalence of the three sums in (1.73) and the convergence of the respective
expressions are easy to verify (Exercise 1.38). The first of the three formulas is sometimes
written in the form of the infinite series

\[ \mathrm{P.V.} \sum_{n=-\infty}^{\infty} \frac{1}{z + n} \tag{1.74} \]

with the caveat that this is to be interpreted in the “principal value” sense, where the
summation is performed symmetrically on positive and negative indices. This also gives
a bit of intuition of why we expect an identity such as (1.74) to hold: the series (1.74),
assuming that we can make sense of it as defining a genuine function, is periodic with
period 1, and its local behavior around $z = n$ for each integer $n$ is the correct principal
part of the function $\pi \cot(\pi z)$ around that point, namely the simple pole $\frac{1}{z - n}$. This
intuition is not quite a proof, but can be turned into one with some additional arguments
(see [3, Ch. 26]). Here we give a more complex-analytic proof based on contour integration.
Figure 1.10: The integration contour $\gamma_N$ and the poles (as a function of $w$ with $z$ fixed) of the integrand
$f_z(w) = \frac{\pi \cot(\pi w)}{(w + z)^2}$.


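Proof of Theorem 1.63. Let $z \in \mathbb{C} \setminus \mathbb{Z}$. Fix a large positive integer $N$. We use the residue
theorem to evaluate the contour integral

\[ I_N(z) = \oint_{\gamma_N} \frac{\pi \cot(\pi w)}{(w + z)^2} \, dw \]

over the contour $\gamma_N$ going in the positive direction around the rectangle with vertices
$(\pm(N + 1/2), \pm N)$; see Fig. 1.10. The integrand $f_z(w) = \frac{\pi \cot(\pi w)}{(w + z)^2}$ has as its poles enclosed by
the contour the points $w = -z$ (assuming that $N$ is chosen large enough) and $w = k \in \mathbb{Z}$,
$-N \le k \le N$. The residues are evaluated without much difficulty as

\[
\operatorname{Res}_{w=-z}(f_z) = -\frac{\pi^2}{\sin^2(\pi z)}, \qquad
\operatorname{Res}_{w=k}(f_z) = \frac{1}{(z + k)^2} \quad (-N \le k \le N),
\]

so the residue theorem gives that

\[ I_N(z) = 2\pi i \Bigl[ -\frac{\pi^2}{\sin^2(\pi z)} + \sum_{k=-N}^{N} \frac{1}{(z + k)^2} \Bigr]. \tag{1.75} \]

Now consider what this means in the limit as $N \to \infty$. We claim that

\[ I_N(z) \xrightarrow[N\to\infty]{} 0, \]

which, together with (1.75), would imply the identity

\[ \frac{\pi^2}{\sin^2(\pi z)} = \sum_{n=-\infty}^{\infty} \frac{1}{(z + n)^2} \qquad (z \in \mathbb{C} \setminus \mathbb{Z}). \tag{1.76} \]

To prove this, first note the auxiliary identities

\[
\begin{aligned}
|\sin(x + iy)|^2 &= \sin^2 x + \sinh^2 y, \\
|\cos(x + iy)|^2 &= \cos^2 x + \sinh^2 y,
\end{aligned}
\qquad (x, y \in \mathbb{R}),
\]

which we leave to the reader to verify. Applying them with real part $\pm\pi(N + 1/2)$ and
imaginary part $\pi y$ for arbitrary $y$, these identities imply the bound

\[ \Bigl| \cot\Bigl( \pm\pi\Bigl(N + \frac{1}{2}\Bigr) + \pi i y \Bigr) \Bigr| = \sqrt{ \frac{\sinh^2(\pi y)}{1 + \sinh^2(\pi y)} } \le 1, \tag{1.77} \]

and similarly, for $y = \pm N$ and $x$ arbitrary, we have

\[ |\cot(\pi x \pm \pi i N)| \le \sqrt{ \frac{1 + \sinh^2(\pi N)}{\sinh^2(\pi N)} } \le 2. \tag{1.78} \]

Every point $w$ on the contour satisfies $|w| \ge N$, hence $|w + z| \ge N - |z|$. The bounds
(1.77)-(1.78) together therefore show that, for $N > |z|$, on the contour $\gamma_N$ the integrand
$f_z(w)$ is bounded in magnitude by $\frac{2\pi}{(N - |z|)^2}$. Since the contour has length $8N + 2 \le 10N$,
this implies that

\[ |I_N(z)| \le \frac{20 \pi N}{(N - |z|)^2} \xrightarrow[N\to\infty]{} 0, \]

as claimed, proving (1.76).
Finally, to derive (1.73), let

\[ F(z) = \pi \cot(\pi z), \qquad G(z) = \frac{1}{z} + \sum_{n=1}^\infty \frac{2z}{z^2 - n^2}. \]

Note that

\[
\begin{aligned}
F'(z) &= -\frac{\pi^2}{\sin^2(\pi z)}, \\
G'(z) &= -\frac{1}{z^2} - 2 \sum_{n=1}^\infty \frac{z^2 + n^2}{(z^2 - n^2)^2}
       = -\frac{1}{z^2} - \sum_{n=1}^\infty \Bigl( \frac{1}{(z + n)^2} + \frac{1}{(z - n)^2} \Bigr) \\
      &= -\sum_{n=-\infty}^{\infty} \frac{1}{(z + n)^2},
\end{aligned}
\]

so that, by (1.76), $F'(z) = G'(z)$. It follows that $F(z) = G(z) + c$ for some constant $c$.
However, $F$ and $G$ are both odd functions, so we must have $c = 0$, i.e., $F = G$.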
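
The key step of the proof, the decay of the contour integral $I_N(z)$, can be illustrated numerically through the residue expression (1.75). The sketch below is an illustration under stated assumptions rather than part of the proof: the function names are ours, and the comparison bound $20\pi N/(N - |z|)^2$ is one admissible choice of constant (contour length at most $10N$ times a pointwise bound of $2\pi/(N - |z|)^2$ on the integrand), valid for $N > |z|$.

```python
import cmath
import math


def contour_integral_via_residues(z, N):
    """Value of I_N(z) as given by the residue theorem, i.e. the right-hand side of (1.75)."""
    s = sum(1 / (z + k) ** 2 for k in range(-N, N + 1))
    return 2j * math.pi * (-(math.pi / cmath.sin(math.pi * z)) ** 2 + s)


def length_times_sup_bound(z, N):
    """Assumed upper bound 20*pi*N / (N - |z|)^2 for |I_N(z)|, valid when N > |z|."""
    return 20 * math.pi * N / (N - abs(z)) ** 2


if __name__ == "__main__":
    z = 0.3 + 0.4j  # arbitrary sample point in C \ Z
    for N in (10, 100, 1000):
        print(N, abs(contour_integral_via_residues(z, N)), length_times_sup_bound(z, N))
```

Both columns decay like a constant multiple of $1/N$, which is the rate at which the partial sums of $\sum_n (z+n)^{-2}$ approach $\pi^2/\sin^2(\pi z)$.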
Proof of Theorem 1.62. Define the holomorphic functions

\[ S(z) = \sin(\pi z), \qquad T(z) = \pi z \prod_{n=1}^\infty \Bigl( 1 - \frac{z^2}{n^2} \Bigr), \]

noting that the convergence of the infinite product to a holomorphic function is justified
by Proposition 1.60. Taking logarithmic derivatives, we see (using (1.71)) that

\[ \frac{S'(z)}{S(z)} = \pi \cot(\pi z), \qquad \frac{T'(z)}{T(z)} = \frac{1}{z} + \sum_{n=1}^\infty \frac{2z}{z^2 - n^2} \qquad (z \in \mathbb{C} \setminus \mathbb{Z}). \]

By (1.73) we therefore see that $S'/S = T'/T$ or, equivalently, $(S/T)' = 0$. It follows that
$S = c_0 T$ for some constant $c_0$. Rewriting this in the form

\[ \frac{\sin(\pi z)}{\pi z} = c_0 \prod_{n=1}^\infty \Bigl( 1 - \frac{z^2}{n^2} \Bigr) \]

and taking the limit as $z \to 0$ shows that $c_0 = 1$ and finishes the proof.


Aside from being a remarkable result in its own right, Theorem 1.62 has a number 
of interesting consequences, discussed in Exercise 1.39. We will also use this result (in 
the equivalent form (1.73)) several times in our studies of modular forms in Chapter 5. 


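Corollary 1.64. We have the infinite product formulas

\[ \cos(\pi z) = \prod_{n=1}^\infty \Bigl( 1 - \frac{4 z^2}{(2n - 1)^2} \Bigr), \tag{1.79} \]

\[ e^z - 1 = z e^{z/2} \prod_{n=1}^\infty \Bigl( 1 + \frac{z^2}{4 \pi^2 n^2} \Bigr). \tag{1.80} \]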


Proof. Exercise 1.40. 


Suggested exercises for Section 1.17. 1.37, 1.38, 1.39, 1.40, 1.41, 1.42. 


1.18 Laurent series 


A Laurent series is a generalization of a power series expansion and takes the form of
a two-sided infinite series

\[ f(z) = \sum_{n=-\infty}^{\infty} a_n (z - z_0)^n \tag{1.81} \]

for some $z_0 \in \mathbb{C}$ and complex coefficients $(a_n)_{n=-\infty}^{\infty}$. Given such a series, it is easy to see
that it converges absolutely and uniformly on compacts in the annulus-shaped region

\[ A_{r,R}(z_0) = \{ z : r < |z - z_0| < R \}, \]

where $R$ is the radius of convergence of the power series $\sum_{n=0}^\infty a_n z^n$, and $r$ is the reciprocal
of the radius of convergence of the power series $\sum_{n=1}^\infty a_{-n} w^n$. Note that this region can
be empty, e.g., if $r = \infty$, $R = 0$, or $r > R$.

The basic question about when a function can be expressed as a Laurent series is 
answered by the following theorem. 


Theorem 1.65. Let $0 \le r < R \le \infty$. Let $f$ be holomorphic in a region $\Omega$ containing the
annulus $A_{r,R}(z_0)$. Then $f(z)$ has a unique representation as a Laurent series (1.81), which
is absolutely convergent uniformly on compacts on $A_{r,R}(z_0)$. The coefficients $a_n$ are given
by

\[ a_n = \frac{1}{2\pi i} \oint_{C_\rho(z_0)} \frac{f(z)}{(z - z_0)^{n+1}} \, dz \qquad (n \in \mathbb{Z}) \tag{1.82} \]

with arbitrary $\rho \in (r, R)$.


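Proof. Uniqueness: Given an expansion of the form (1.81) known to converge absolutely
uniformly on compacts on $A_{r,R}(z_0)$, let $\rho \in (r, R)$, and observe that

\[
\begin{aligned}
\frac{1}{2\pi i} \oint_{C_\rho(z_0)} \frac{f(z)}{(z - z_0)^{n+1}} \, dz
&= \frac{1}{2\pi i} \oint_{C_\rho(z_0)} (z - z_0)^{-n-1} \Bigl( \sum_{m=-\infty}^{\infty} a_m (z - z_0)^m \Bigr) dz \\
&= \sum_{m=-\infty}^{\infty} a_m \Bigl( \frac{1}{2\pi i} \oint_{C_\rho(z_0)} (z - z_0)^{m-n-1} \, dz \Bigr) = a_n.
\end{aligned}
\]

(The last step uses the standard formula (1.92) from Exercise 1.21.) Thus the $a_n$ are
determined uniquely and are given by (1.82).

Existence: Fix $z \in A_{r,R}(z_0)$. Take numbers $\rho_-$ and $\rho_+$ with $r < \rho_- < |z - z_0| < \rho_+ < R$.
Then, by a standard limiting argument involving a keyhole contour, we can show that

\[ f(z) = \frac{1}{2\pi i} \oint_{C_{\rho_+}(z_0)} \frac{f(w)}{w - z} \, dw - \frac{1}{2\pi i} \oint_{C_{\rho_-}(z_0)} \frac{f(w)}{w - z} \, dw. \]

In this representation the factors $\frac{1}{w - z}$ inside the two integrals can be expanded as
geometric series (in two different ways, one being valid for $w \in C_{\rho_+}(z_0)$ and the other for $w$
on the circle $C_{\rho_-}(z_0)$). This leads to

\[
\begin{aligned}
f(z)
&= \frac{1}{2\pi i} \oint_{C_{\rho_+}(z_0)} \sum_{n=0}^{\infty} \frac{(z - z_0)^n}{(w - z_0)^{n+1}} f(w) \, dw
 + \frac{1}{2\pi i} \oint_{C_{\rho_-}(z_0)} \sum_{n=-\infty}^{-1} \frac{(z - z_0)^n}{(w - z_0)^{n+1}} f(w) \, dw \\
&= \sum_{n=0}^{\infty} \Bigl( \frac{1}{2\pi i} \oint_{C_{\rho_+}(z_0)} \frac{f(w)}{(w - z_0)^{n+1}} \, dw \Bigr) (z - z_0)^n
 + \sum_{n=-\infty}^{-1} \Bigl( \frac{1}{2\pi i} \oint_{C_{\rho_-}(z_0)} \frac{f(w)}{(w - z_0)^{n+1}} \, dw \Bigr) (z - z_0)^n,
\end{aligned}
\]

where in the last step, we interchanged the summation and integration; this is easy to
justify, since the geometric series converge uniformly on the integration contours. We
have therefore obtained a representation for $f(z)$, which we see is of the form (1.81) with
the coefficients $a_n$ given by

\[
a_n =
\begin{cases}
\dfrac{1}{2\pi i} \displaystyle\oint_{C_{\rho_+}(z_0)} \frac{f(w)}{(w - z_0)^{n+1}} \, dw & \text{if } n \ge 0, \\[2ex]
\dfrac{1}{2\pi i} \displaystyle\oint_{C_{\rho_-}(z_0)} \frac{f(w)}{(w - z_0)^{n+1}} \, dw & \text{if } n < 0.
\end{cases}
\tag{1.83}
\]

Finally, observe that (1.83) is equivalent to (1.82), since, by another application of
Cauchy’s formula on an appropriate keyhole contour, it can easily be shown that the
integral on the right-hand side of (1.82) is independent of the radius $\rho$ of the integration
contour.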
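
Formula (1.82) also lends itself to numerical computation. The following is a rough sketch under our own assumptions: the example function $f(z) = \frac{1}{(z-1)(z-2)}$, the function name `laurent_coefficient`, and the number of sample points are illustrative choices, not taken from the text. Parameterizing the circle as $z = z_0 + \rho e^{i\theta}$ turns (1.82) into $a_n = \frac{1}{2\pi} \int_0^{2\pi} f(z_0 + \rho e^{i\theta}) (\rho e^{i\theta})^{-n} \, d\theta$, which we approximate by an equally spaced Riemann sum.

```python
import cmath
import math


def laurent_coefficient(f, n, rho, z0=0j, M=2048):
    """Approximate a_n from (1.82) on the circle |z - z0| = rho using M equally spaced nodes."""
    total = 0j
    for j in range(M):
        u = rho * cmath.exp(2j * math.pi * j / M)  # u = z - z0 on the circle
        total += f(z0 + u) * u ** (-n)
    return total / M


def f(z):
    # Hypothetical example: holomorphic in the annulus 1 < |z| < 2, where its
    # Laurent coefficients are a_n = -1/2^(n+1) for n >= 0 and a_n = -1 for n <= -1.
    return 1 / ((z - 1) * (z - 2))


if __name__ == "__main__":
    for n in range(-3, 4):
        print(n, laurent_coefficient(f, n, rho=1.5))
```

Repeating the computation with a different radius in $(1, 2)$ gives the same values up to rounding, which illustrates the independence of (1.82) from $\rho$ noted at the end of the proof.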


Suggested exercises for Section 1.18. 1.43, 1.44. 
Exercises for Chapter 1


1.1 An immediate corollary of the fundamental theorem of algebra is that any complex
polynomial

\[ p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0 \]

(where $a_0, \ldots, a_n \in \mathbb{C}$ and $a_n \neq 0$) can be factored as

\[ p(z) = a_n \prod_{k=1}^{n} (z - z_k) \]

for some $z_1, \ldots, z_n \in \mathbb{C}$; these are the roots of $p(z)$ counted with multiplicities. Use
this to prove that any such polynomial where the coefficients $a_0, \ldots, a_n$ are real has
a factorization

\[ p(z) = a_n Q_1(z) Q_2(z) \cdots Q_m(z), \]

where each $Q_k(z)$ is a linear or quadratic monic polynomial (i.e., is of one of the
forms $z - c$ or $z^2 + bz + c$) with real coefficients.

1.2 If $p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$ is a polynomial of degree $n$ such that

\[ |a_n| > \sum_{j=0}^{n-1} |a_j|, \]

then prove that $p(z)$ has exactly $n$ zeros (counting multiplicities) in the unit disc $\mathbb{D}$.
Guidance. Use the fundamental theorem of algebra.

Note. This is a particular case of a less elementary fact, which can be proved using
Rouché’s theorem; see Exercise 1.29.

1.3 For each of the following functions, determine where it is holomorphic.

a. $f(z) = z$        c. $f(z) = |z|$        e. $f(z) = \bar{z}$

b. $f(z) = \operatorname{Re}(z)$    d. $f(z) = |z|^2$      f. $f(z) = 1/z$

1.4 For each of relations (1.11)-(1.15) in Lemma 1.4, explain precisely what holomorphicity
assumptions are needed for the relation to hold and prove its correctness
under those assumptions.

1.5 Draw (approximately, or with as much precision as you can) the image in the
$w$-plane of the following figure in the $z$-plane
under each of the following maps $w = f(z)$:

a. w= 32 Cc. W=Z e w=1/z 
b. w=iz d. w=(2+0z-3 f. w=z-1 
1.6 Let $f$ be a holomorphic function in a region $\Omega$, and let $\gamma : (a, b) \to \mathbb{C}$ be a
differentiable parameterized curve in $\Omega$. Prove that

\[ \frac{d}{dt} \bigl( f(\gamma(t)) \bigr) = f'(\gamma(t)) \, \gamma'(t). \tag{1.84} \]


1.7 Complete the proof of Lemma 1.7 by proving the remaining implications (b) <=> (c) 
and (b) => (a), which were not proved in the text. 

1.8 For each of the following functions u(x, y), determine if there exists a function 
v(x, y) such that f(x + iy) = u(x,y) + iv(x, y) is an entire function, and if so, then 
find it and try to find a formula for f(z) directly in terms of z rather than in terms 
of its real and imaginary parts. 

a u(x%,y)=x?-y" c. u(x y) = x* - 6x’y? +3x4+yt-2 
b. ux y) =y? d. u(x, y) = cosx coshy 

1.9 Alternative form of the Cauchy-Riemann equations. A function $f = u + iv$ of
a complex variable $z = x + iy$ is traditionally thought of as a function of the two
coordinates $x$ and $y$. However, if we think of the equations

\[ z = x + iy, \qquad \bar{z} = x - iy \]

as representing a formal change of variables from the “real coordinates” $(x, y)$ to
the “complex conjugate coordinates” $(z, \bar{z})$, then it may make sense to think of $f$ as
a function of the two variables $z$ and $\bar{z}$ (pretending that those are two independent
variables). Thus we may suggestively write $u = u(z, \bar{z})$ and $v = v(z, \bar{z})$ and consider
operations such as taking the partial derivatives of $f$, $u$, $v$ with respect to $z$ and $\bar{z}$.
Show that from this point of view, the Cauchy-Riemann equations

\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \]

can be rewritten in the more concise equivalent form

\[ \frac{\partial f}{\partial \bar{z}} = 0, \]

assuming that it is okay to apply the chain rule from multivariable calculus; and,
moreover, that in this notation, we also have the identity

\[ f'(z) = \frac{\partial f}{\partial z}. \]
1.10 Let $f : \Omega \to \mathbb{C}$ be a function defined on a region $\Omega$ such that both functions $f(z)$ and
$z f(z)$ have real and imaginary parts that are harmonic functions. Prove that $f(z)$ is
holomorphic on $\Omega$.

1.11 Let $p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$ be a complex polynomial of degree $n \ge 2$ (that is,
$a_0, \ldots, a_n \in \mathbb{C}$ and $a_n \neq 0$), and let $z_1, \ldots, z_n$ be its roots counted with multiplicities.
Let $w_1, \ldots, w_{n-1}$ denote the roots of $p'(z)$. Prove the following claim, known as the
Gauss-Lucas theorem.

Theorem 1.66 (Gauss-Lucas theorem). The points $w_1, \ldots, w_{n-1}$ all lie in the convex
hull of $z_1, \ldots, z_n$, that is, each $w_k$ can be expressed as a convex combination

\[ w_k = \alpha_1^{(k)} z_1 + \alpha_2^{(k)} z_2 + \cdots + \alpha_n^{(k)} z_n \]

for some coefficients $\alpha_1^{(k)}, \ldots, \alpha_n^{(k)} \ge 0$ satisfying $\sum_{j=1}^{n} \alpha_j^{(k)} = 1$.

See Fig. 1.11 for an illustration of this phenomenon.

Figure 1.11: An illustration of the Gauss-Lucas theorem discussed in Exercise 1.11, showing the roots
$z_1, \ldots, z_7$ of a polynomial $p(z)$ of degree 7, their convex hull, and the roots $w_1, \ldots, w_6$ of $p'(z)$.
1.12 Illustrate the claim from p. 19 regarding the orthogonality of the level curves of the
real and imaginary parts of holomorphic functions by plotting some of the level
curves of $\operatorname{Re}(f)$ and $\operatorname{Im}(f)$ for each of the following functions:

a. $f(z) = z^2$      b. $f(z) = 1/z$      c. $f(z) = e^z$

1.13 Complete the argument of the proof of Lemma 1.8 in the extreme cases $R = 0, \infty$.

1.14 Using the formula $e^z = \sum_{n=0}^\infty \frac{z^n}{n!}$ as the definition of the exponential function, prove
that

\[ e^{w + z} = e^w e^z \qquad (w, z \in \mathbb{C}). \]


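1.15 The Bernoulli numbers. The Bernoulli numbers are the numbers $(B_n)_{n=0}^\infty$ defined
by the power series expansion

\[ \frac{z}{e^z - 1} = \sum_{n=0}^\infty \frac{B_n}{n!} z^n. \tag{1.85} \]

For example, the first three Bernoulli numbers are $B_0 = 1$, $B_1 = -1/2$, and $B_2 = 1/6$.
(a) Find the radius of convergence of the series (1.85).
(b) Prove that the Bernoulli numbers satisfy the following identities:

\[ B_{2k+1} = 0 \qquad (k = 1, 2, \ldots), \tag{1.86} \]

\[ (n + 1) B_n = -\sum_{k=0}^{n-1} \binom{n+1}{k} B_k \qquad (n \ge 1), \tag{1.87} \]

\[ (2n + 1) B_{2n} = -\sum_{k=1}^{n-1} \binom{2n}{2k} B_{2k} B_{2n-2k} \qquad (n \ge 2), \tag{1.88} \]

\[ \frac{z}{2} \coth\Bigl( \frac{z}{2} \Bigr) = \sum_{n=0}^\infty \frac{B_{2n}}{(2n)!} z^{2n}. \tag{1.89} \]

Hint for (1.88). Writing $f(z) = \frac{z}{e^z - 1}$ for the function in (1.85), show that the function
$g(z) = f(z) + z/2$ satisfies the ordinary differential equation $g(z) - z g'(z) = g(z)^2 - z^2/4$.
(c) Prove that

\[ \limsup_{n\to\infty} \Bigl| \frac{B_{2n}}{(2n)!} \Bigr|^{1/(2n)} = \frac{1}{2\pi}. \tag{1.90} \]

(See also Exercise 1.39, where we will derive a much more precise formula for
the asymptotic behavior of $B_{2n}$ for large $n$.)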
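
For experimenting with the Bernoulli numbers, the recurrence $(n+1) B_n = -\sum_{k=0}^{n-1} \binom{n+1}{k} B_k$ for $n \ge 1$, started from $B_0 = 1$, gives a direct way to compute them in exact rational arithmetic. The following is a minimal sketch; the function name `bernoulli_numbers` is our own, and the code is a computational aid for checking identities on examples, not a substitute for the proofs requested in the exercise.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """Return [B_0, ..., B_{n_max}] computed from the recurrence
    (n + 1) B_n = -sum_{k=0}^{n-1} C(n + 1, k) B_k, starting from B_0 = 1."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        s = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-s / (n + 1))
    return B


if __name__ == "__main__":
    for n, b in enumerate(bernoulli_numbers(10)):
        print(n, b)
```

With exact fractions one can, for instance, confirm on small cases that the odd-indexed values beyond $B_1$ vanish and that the quadratic relation $(2n+1) B_{2n} = -\sum_{k=1}^{n-1} \binom{2n}{2k} B_{2k} B_{2n-2k}$ holds for the first few $n \ge 2$.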
1.16 Bessel functions. The Bessel functions (also known as Bessel functions of the
first kind) are a family of functions $(J_n)_{n=-\infty}^{\infty}$ of a complex variable, defined by

\[ J_n(z) = \sum_{k=0}^\infty \frac{(-1)^k}{k! \, (k + n)!} \Bigl( \frac{z}{2} \Bigr)^{2k+n}. \tag{1.91} \]

(For example, note that $J_0(-2\sqrt{x}) = \sum_{k=0}^\infty \frac{(-1)^k x^k}{(k!)^2}$, which is reminiscent of the exponential
function and already seems like a fairly natural function to study.)

(a) For which $z \in \mathbb{C}$ does the series (1.91) converge?

(b) Prove that the Bessel functions satisfy the following properties: 


J-n(Z) = (-1)"Jn (z), 
Ines) = ZRD -J2 


ji Ly ae 
Jn (z) = -zh T a Inf). 


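(c) Prove the following additional identities:

\[ \exp\Bigl( \frac{z}{2} \Bigl( t - \frac{1}{t} \Bigr) \Bigr) = \sum_{n=-\infty}^{\infty} J_n(z) t^n, \]

\[ \cos(z \sin t) = J_0(z) + 2 \sum_{n=1}^\infty J_{2n}(z) \cos(2nt), \]

\[ \sin(z \sin t) = 2 \sum_{n=0}^\infty J_{2n+1}(z) \sin((2n + 1)t), \]

\[ \cos(z \cos t) = J_0(z) + 2 \sum_{n=1}^\infty (-1)^n J_{2n}(z) \cos(2nt), \]

\[ \sin(z \cos t) = 2 \sum_{n=0}^\infty (-1)^n J_{2n+1}(z) \cos((2n + 1)t), \]

\[ J_n(z) = \frac{1}{\pi} \int_0^\pi \cos(z \sin t - nt) \, dt. \]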


1.17 Show that, analogously to the calculation in (1.35), the arc length integral (1.34) does 
not depend on the particular parameterization chosen for the curve y. 

1.18 Prove Proposition 1.12. (Part of the exercise is to define precisely the notions of 
“composition of curves” and “reverse curve”). 

1.19 Prove that homotopy of curves defined at the beginning of Section 1.8 is an equiv- 
alence relation. 

1.20 Prove that C is simply connected. 

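1.21 (a) For $r > 0$ and $n \in \mathbb{Z}$, show that

\[
\oint_{|z| = r} z^n \, dz =
\begin{cases}
2\pi i & \text{if } n = -1, \\
0 & \text{otherwise.}
\end{cases}
\tag{1.92}
\]

(b) For which $n \in \mathbb{Z}$ does the function $f(z) = z^n$ have a primitive in $\mathbb{C} \setminus \{0\}$? Explain.
(c) Is the “punctured complex plane” $\mathbb{C} \setminus \{0\}$ simply connected? Explain.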
(c) Is the “punctured complex plane” C \ {0} simply connected? Explain. 
1.22 Cauchy’s theorem and irrotational vector fields. Recall from vector calculus that
a planar vector field $F = (P, Q)$ defined on some region $\Omega \subset \mathbb{C} = \mathbb{R}^2$ is called
conservative if it is of the form $F = \nabla g = \bigl( \frac{\partial g}{\partial x}, \frac{\partial g}{\partial y} \bigr)$ (the gradient of $g$) for some scalar
function $g : \Omega \to \mathbb{R}$. By the fundamental theorem of calculus for line integrals, for
such a vector field, we have

\[ \oint_\gamma F \cdot ds = 0 \]

for any closed curve $\gamma$. Recall also that (as is easy to check) any conservative vector
field is irrotational, that is, it satisfies

\[ \operatorname{curl} F = 0, \]

where, in the context of two-dimensional vector fields, the curl operator is defined
by

\[ \operatorname{curl} F = \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}. \]

The following converse to this result can be shown: if the region $\Omega$ is simply
connected, then a theorem in vector calculus says that an irrotational vector field is
also conservative.

Use these background results to show that if $f = u + iv$ is holomorphic on a simply
connected region $\Omega$, then

\[ \oint_\gamma f(z) \, dz = 0 \]

for any closed curve $\gamma$ in $\Omega$. (This is, of course, Cauchy’s theorem.)

1.23 Show that Liouville’s theorem (Theorem 1.33) can be proved directly using the “simple”
($n = 0$) case of Cauchy’s integral formula, instead of using the case $n = 1$ of the
extended formula as we did in the text.

1.24 Show that Liouville’s theorem can in fact be deduced even just from the mean value
property of holomorphic functions (Theorem 1.28), which, as you may recall, is the
particular case of Cauchy’s integral formula in which $z$ is taken as the center of the
circle around which the integration is performed.

Guidance. Here it makes sense to consider a modified version of the mean value
property (that follows easily from the original version) that says that $f(z)$ is the
average value of $f(w)$ over a disc $D_R(z)$ (instead of a circle $C_R(z)$), that is,

\[ f(z) = \frac{1}{\pi R^2} \iint_{D_R(z)} f(x + iy) \, dx \, dy, \]

where the integral is an ordinary two-dimensional Riemann integral. Explain why
this formula holds, then use it to bound $|f(z_1) - f(z_2)|$ (for arbitrary complex numbers
$z_1, z_2$) from above by a quantity that goes to 0 as $R \to \infty$.

1.25 Prove the following generalization of Liouville’s theorem: let $f$ be an entire function
that for all $z \in \mathbb{C}$ satisfies the inequality

\[ |f(z)| \le A + B |z|^n \]

for some constants $A, B > 0$ and integer $n \ge 0$. Then $f$ is a polynomial of degree at
most $n$.

1.26 Integration of a family of holomorphic functions with respect to a parameter.
Let $I \subset \mathbb{R}$ be an interval, and let $\Omega$ be a complex region. Let $F(t, z)$ be a function of a
real parameter $t \in I$ and a complex variable $z \in \Omega$. Assume that $F(t, z)$ is continuous
on $I \times \Omega$, holomorphic in $z$ for any fixed $t \in I$, and that for any compact set $K \subset \Omega$,
$\sup_{z \in K} \int_I |F(t, z)| \, dt < \infty$. Prove that the function $f : \Omega \to \mathbb{C}$ defined by

\[ f(z) = \int_I F(t, z) \, dt \]

is holomorphic on $\Omega$.

1.27 (a) Explain how to derive the formulas (1.52)-(1.53) through purely formal algebraic
manipulations. Are these manipulations valid in any sense you are familiar
with from real analysis?

(b) Explain how the principle of analytic continuation can breathe new life into
the two formulas by providing a context within which the formulas can be
interpreted as having a precise, well-defined (and correct) meaning.

1.28 Prove properties (1.62)-(1.63) of the generalized order of a zero of a holomorphic
function at a point $z_0$. Can you give a useful condition for when equality holds
in (1.62)?

1.29 If $p(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$ is a polynomial of degree $n$ such that for some
$0 \le k \le n$, we have

\[ |a_k| > \sum_{\substack{0 \le j \le n \\ j \neq k}} |a_j|, \]

then prove that $p(z)$ has exactly $k$ zeros (counting multiplicities) in the unit disc $\mathbb{D}$.
Guidance. Use Rouché’s theorem.
Note. This result generalizes the result of Exercise 1.2.

1.30 Show how Rouché’s theorem can be used to give yet another proof of the fundamental
theorem of algebra. This proof is one way to make precise the intuitively compelling
“topological” proof idea discussed in Section 1.2.
1.31 Prove that the principal branch of the logarithm has the Taylor series expansion

\[ \operatorname{Log} z = \sum_{n=1}^\infty \frac{(-1)^{n-1}}{n} (z - 1)^n \qquad (|z - 1| < 1) \tag{1.93} \]

around $z = 1$.

1.32 What is (or are) the complex number (or numbers) represented by $i^i$?

1.33 (a) Draw a simply connected region $\Omega \subset \mathbb{C}$ such that $0 \notin \Omega$, $1, 2 \in \Omega$, and such that
there exists a branch $F(z)$ of the logarithm function on $\Omega$ satisfying

\[ F(1) = 0, \qquad F(2) = \log 2 + 2\pi i \]

(where $\log 2$ is the ordinary natural logarithm of 2 in the usual sense of real
analysis).

(b) More generally, let $k \in \mathbb{Z}$. If we were to replace the above condition
$F(2) = \log 2 + 2\pi i$ with the more general condition $F(2) = \log 2 + 2\pi i k$ but keep all the
other conditions, would an appropriate simply connected region $\Omega = \Omega(k)$ exist
to make that possible? If so, then what would this region look like, roughly,
as a function of $k$?

1.34 Prove Theorem 1.53.

1.35 Prove Theorem 1.54.

1.36 Prove Theorem 1.56.

1.37 Prove Proposition 1.61.

1.38 Prove that the three infinite series in (1.73) all converge for $z \in \mathbb{C} \setminus \mathbb{Z}$ and represent
the same function.

1.39 Consequences of the infinite product formula for the sine function.

(a) By specializing the value of $z$ in (1.72) to an appropriate specific value, obtain
the following infinite product formula for $\pi$, known as Wallis’ product (first
proved by John Wallis in 1655):

\[ \frac{\pi}{2} = \frac{2}{1} \cdot \frac{2}{3} \cdot \frac{4}{3} \cdot \frac{4}{5} \cdot \frac{6}{5} \cdot \frac{6}{7} \cdot \frac{8}{7} \cdot \frac{8}{9} \cdots \]

(b) By comparing the first terms in the Taylor expansion around $z = 0$ of both
sides of (1.72), derive the well-known identities

\[ \sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6}, \qquad \sum_{n=1}^\infty \frac{1}{n^4} = \frac{\pi^4}{90}. \tag{1.94} \]

(c) More generally, we can use (1.72) or, more conveniently, its equivalent cousin
(1.73) to obtain closed formulas for all the series

\[ \zeta(2k) = \sum_{n=1}^\infty \frac{1}{n^{2k}} = 1 + \frac{1}{2^{2k}} + \frac{1}{3^{2k}} + \frac{1}{4^{2k}} + \cdots \qquad (k = 1, 2, \ldots). \]

(The notation $\zeta(2k)$ for these infinite sums is standard and has to do with
the fact that these are the special values of the Riemann zeta function
$\zeta(s) = \sum_{n=1}^\infty \frac{1}{n^s}$ at the positive even integers; see Chapter 2.) To see this, expand both
sides of the relation

\[ \pi \cot(\pi z) = \frac{1}{z} + \sum_{n \neq 0} \Bigl( \frac{1}{z + n} - \frac{1}{n} \Bigr) \qquad (z \in \mathbb{C} \setminus \mathbb{Z}) \]

in a Taylor series around $z = 0$, making use of identity (1.89) from Exercise 1.15.
Compare the coefficients and simplify to get the famous formula

\[ \zeta(2k) = (-1)^{k-1} \frac{(2\pi)^{2k}}{2 (2k)!} B_{2k} \qquad (k \ge 1). \tag{1.95} \]

For example, using the first few values $B_2 = \frac{1}{6}$, $B_4 = -\frac{1}{30}$, $B_6 = \frac{1}{42}$, and $B_8 = -\frac{1}{30}$,
we get

\[
\begin{aligned}
\zeta(2) &= \sum_{n=1}^\infty \frac{1}{n^2} = \frac{\pi^2}{6}, &
\zeta(4) &= \sum_{n=1}^\infty \frac{1}{n^4} = \frac{\pi^4}{90}, \\
\zeta(6) &= \sum_{n=1}^\infty \frac{1}{n^6} = \frac{\pi^6}{945}, &
\zeta(8) &= \sum_{n=1}^\infty \frac{1}{n^8} = \frac{\pi^8}{9450},
\end{aligned}
\]

where of course the first two values coincide with those from (1.94).
(d) Show that $\zeta(2k) = 1 + O(2^{-2k})$ as $k \to \infty$ and deduce that the asymptotic
behavior of the Bernoulli numbers is given by

\[ B_{2k} = \bigl( 1 + O(2^{-2k}) \bigr) (-1)^{k-1} \frac{2 (2k)!}{(2\pi)^{2k}}. \]

Note that this is consistent with the earlier weaker estimate (1.90).
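
The closed formula $\zeta(2k) = (-1)^{k-1} \frac{(2\pi)^{2k}}{2(2k)!} B_{2k}$ can be checked against direct summation. The sketch below is illustrative only: the function names are ours, and the Bernoulli numbers $B_2 = \frac{1}{6}$, $B_4 = -\frac{1}{30}$, $B_6 = \frac{1}{42}$, $B_8 = -\frac{1}{30}$ are hard-coded rather than computed.

```python
import math
from fractions import Fraction

# B_{2k} for k = 1, 2, 3, 4 (hard-coded for this illustration).
BERNOULLI_EVEN = {
    1: Fraction(1, 6),
    2: Fraction(-1, 30),
    3: Fraction(1, 42),
    4: Fraction(-1, 30),
}


def zeta_even_closed_form(k):
    """Evaluate (-1)^(k-1) * (2 pi)^(2k) / (2 (2k)!) * B_{2k} in floating point."""
    b = float(BERNOULLI_EVEN[k])
    return (-1) ** (k - 1) * (2 * math.pi) ** (2 * k) / (2 * math.factorial(2 * k)) * b


def zeta_partial_sum(s, N):
    """Partial sum sum_{n=1}^{N} 1 / n^s of the series defining zeta(s)."""
    return sum(1.0 / n**s for n in range(1, N + 1))


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(2 * k, zeta_even_closed_form(k), zeta_partial_sum(2 * k, 100000))
```

The agreement improves rapidly with $k$, since the tail of $\sum n^{-2k}$ beyond $N$ is of order $N^{1-2k}$.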
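1.40 Prove identities (1.79)-(1.80).

1.41 (a) Prove the infinite product formula

\[ \sin(z) = z \prod_{n=1}^\infty \cos\Bigl( \frac{z}{2^n} \Bigr) \qquad (z \in \mathbb{C}). \tag{1.96} \]

Hint. $\sin(z) = 2 \sin(z/2) \cos(z/2)$.

(b) By substituting an appropriate value of $z$ into (1.96) prove the formula

\[ \frac{2}{\pi} = \frac{\sqrt{2}}{2} \cdot \frac{\sqrt{2 + \sqrt{2}}}{2} \cdot \frac{\sqrt{2 + \sqrt{2 + \sqrt{2}}}}{2} \cdots, \]

first proved by François Viète in 1593.

1.42 Evaluate the following infinite products:

(a) $\displaystyle \prod_{n=2}^\infty \frac{n^2 - 1}{n^2} = \frac{3}{4} \cdot \frac{8}{9} \cdot \frac{15}{16} \cdot \frac{24}{25} \cdots = \,?$

(b) $\displaystyle \prod_{n=1}^\infty \frac{n^2 + 1}{n^2} = \frac{2}{1} \cdot \frac{5}{4} \cdot \frac{10}{9} \cdot \frac{17}{16} \cdot \frac{26}{25} \cdots = \,?$

Later, in Chapter 2, we will encounter an interesting variation on these infinite
products; see Exercise 2.17.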

1.43 Let $f$ be holomorphic in a punctured neighborhood of a point $z_0 \in \mathbb{C}$. Assume that
$f$ has a pole of order $k$ at $z_0$. Show that the Laurent series (1.81) in this case takes
the form

\[ f(z) = \sum_{n=-k}^{\infty} a_n (z - z_0)^n. \]

1.44 Let f(z) = OD z: BY Theorem 1.65, f(z) has a Laurent series (1.81) that converges in 
the punctured disc {0 < |z| < 2}, and separately from that, f(z) has a Laurent series 
that converges in {2 < |z| < oo}. Find the coefficients a, explicitly for both those 
Laurent series. 

1.45 Let f(z) = p(z)/q(z) be a rational function such that deg q > deg p + 2 (where deg p 
denotes the degree of a polynomial p). Prove that the sum of the residues of f(z) 
over all its poles is equal to 0. 

1.46 Sendov’s conjecture, an elementary statement in complex analysis proposed by
the mathematician Blagovest Sendov in 1959 and still open today, is the claim that
if $p(z) = (z - z_1) \cdots (z - z_n)$ is a complex polynomial whose roots $z_j$, $j = 1, \ldots, n$, all lie
in the closed unit disc $|z| \le 1$, then for each root $z_j$, there is a root $\alpha$ of the derivative
$p'(z)$ for which $|z_j - \alpha| \le 1$.

(a) Prove the conjecture for the case $n = 2$ of quadratic polynomials.

(b) Prove that if in the inequality $|z_j - \alpha| \le 1$, the number 1 is replaced by any
smaller number, then the claim is false.

(c) Prove the conjecture for the case $n = 3$ of cubic polynomials. (This is not a
completely trivial result; for one possible proof, see [11].)

1.47 Use Cauchy’s theorem and the residue theorem to calculate the following definite 
integrals: 

o0 


(a) | e” dx = VT. 


—0o 


(b) | eT bmx gy — e ™ (ueR). 
eT 

(c) [sne )dt = f cost )dt = a 
0 0 

1 

@) | cosh(x) pre 
T 1 2riux = 1 

@) | cosh(zx) ~ cosh(zu) ich: 
T 1 iux —|ul 

(£) | =e dx = ne! (u€ R). 
T e T 

6) | Ire sin(ztu) ised) 
2 The prime number theorem


My distinguished friend:


Your remarks concerning the frequency of primes were of interest to me in more ways than one.
You have reminded me of my own endeavors in the field which began in the very distant past, in
1792 or 1793, after I had acquired the Lambert supplements to the logarithmic tables. Even before
I had begun my more detailed investigations into higher arithmetic, one of my first projects was
to turn my attention to the decreasing frequency of primes, to which end I counted the primes in
several chiliads and recorded the results on the attached white pages. I soon recognized that behind
all of its fluctuations, this frequency is on the average inversely proportional to the logarithm, so
that the number of primes below a given bound n is approximately equal to

\[ \int \frac{dn}{\log n}, \]

where the logarithm is understood to be hyperbolic.


Carl Friedrich Gauss, letter to Johann Encke dated December 24, 1849


2.1 Motivation: analytic number theory and the distribution of 
prime numbers 


Humans have been fascinated by the prime numbers since antiquity. Euclid famously
proved that there exist infinitely many prime numbers; his ingenious proof still delights
us today. Eratosthenes developed his eponymous sieve algorithm for finding all primes
up to some prescribed upper limit. They and the mathematicians who came after them
continued to puzzle over the apparent erraticism with which prime numbers seem to be
spread out among the natural numbers. For a long time, the only empirical observation
anyone dared to make concerning the primes was that as we look at higher and higher
numbers, primes seem to occur with a diminishing frequency.

It was only in the late eighteenth century that mathematicians started making more
quantitative guesses. Gauss observed privately in 1792 or 1793 (when he was around 16
years old!) that the density of primes found around a certain integer $n$ falls like the
inverse of the logarithm of $n$; see the epigraphic quote above, and the historical survey
[30]. This is easily seen to be equivalent to the statement that the number $\pi(x)$ of prime
numbers up to a given upper bound $x$ behaves like $\frac{x}{\log x}$ as $x \to \infty$. Legendre, who
was unaware of Gauss’s unpublished investigations, published an equivalent formula
in 1808. This statement is now known as the prime number theorem.


Theorem 2.1 (Prime number theorem). The prime-counting function $\pi(x)$ behaves
asymptotically as

\[ \pi(x) \sim \frac{x}{\log x} \qquad \text{as } x \to \infty. \tag{2.1} \]
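
To get a feeling for the statement, one can tabulate $\pi(x)$ against $x / \log x$ for moderate values of $x$. The following is a small sketch using the sieve of Eratosthenes; the function name `prime_count` and the choice of sample values are our own. It offers numerical illustration only and of course no evidence about the limit itself.

```python
import math


def prime_count(x):
    """Return pi(x), the number of primes <= x, using the sieve of Eratosthenes."""
    if x < 2:
        return 0
    sieve = bytearray([1]) * (x + 1)  # sieve[m] == 1 means "m is still a candidate prime"
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # Cross out the multiples p*p, p*p + p, ... of the prime p.
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)


if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5, 10**6):
        ratio = prime_count(x) / (x / math.log(x))
        print(x, prime_count(x), ratio)
```

In this range the ratio $\pi(x) / (x / \log x)$ decreases slowly toward 1, which gives a sense of how gradual the convergence in (2.1) is.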
What is so striking about this result is that it takes a set of objects that appear to be
the epitome of disorder, at least when inspected on a small scale, and in one clean, simple 
statement decrees that they nonetheless obey a very rigid law on the large scale. More- 
over, the connection to calculus in the form of the appearance of the natural logarithm 
function seems surprising in view of the existence of prime numbers as fundamentally 
discrete objects that do not appear to have any connection to the types of continuous 
phenomena calculus was developed to understand. 

Gauss’ conjecture, though bold and (as it turned out) correct, was ahead of its time; 
he and his contemporaries lacked the tools to make any significant progress on the prob- 
lem until several decades later. In fact, the entire field of complex analysis had yet to be 
invented, and it would turn out that that branch of mathematics is rather crucial to the 
methods involved in an eventual proof. Even when Riemann came up with some of the 
ideas that would turn out to be the most significant, in a famous paper he published 
in 1859—one of the most famous papers in the entire history of mathematics, titled On 
the Number of Primes Less Than a Given Magnitude—significant work still needed to be 
done, and several more decades would pass before the result became a proper theorem. 
This happened in 1896, when two proofs were published independently by Hadamard 
and de la Vallée Poussin. 

The work that led to the proof has become a cornerstone of what is now its own 
rich area of mathematics, known as analytic number theory. At the heart of this field 
is one of the greatest mathematical questions of all time, the still unsolved Riemann 
hypothesis, which can be thought of as being, in a rather precise sense, the “ultimate” 
version of the prime number theorem [46]. 

In this chapter, our ostensible goal is a proof of the prime number theorem, which 
in my opinion is the quintessential application of complex analysis.¹ However, this is
a case where the journey is no less interesting than the destination and will take us 
through a study of two special functions that play a crucial role in the proof: the Eu- 
ler gamma function and the Riemann zeta function. These functions are well worth 
learning about for their own sake, independently of the prime number theorem, and be- 
cause of their applicability to many other problems in pure and applied mathematics. 


2.2 The Euler gamma function 


The Euler gamma function (often referred to simply as the gamma function) is one 
of the most important special functions in mathematics. It has applications to many 
areas, such as combinatorics, number theory, differential equations, probability, and 


1 To be fair, so-called “elementary” proofs of the prime number theorem that avoid the use of complex 
analysis have been found, but this development came much later, required great effort and ingenuity, 
and many mathematicians seem to agree that these proofs are conceptually less appealing and fruitful 
for understanding the behavior of the prime numbers than the complex-analytic proofs. 
more, and is probably the most ubiquitous transcendental function after the “elementary”
transcendental functions (the exponential function, logarithms, trigonometric
functions, and their inverses) that we learn about in calculus. The gamma function is a 
natural meromorphic function of a complex variable that extends the factorial function 
to noninteger values. In complex analysis, it is particularly important in connection 
with the theory of the Mellin transform (a version of the Fourier transform associ- 
ated with the multiplicative group of positive real numbers in the same way that the 
ordinary Fourier transform is associated with the additive group of the real numbers). 

Most textbooks define the gamma function in one way and proceed to prove several 
other equivalent representations of it. I have always found that approach to be slightly 
misleading; the truth is that none of the representations of the gamma function is more 
fundamental or “natural” than the others. It seems more logical to me to present the 
topic by listing the various formulas and properties associated with the gamma function 
and then proving that that list adds up to a consistent whole, that is, that there exists a 
unique mathematical object satisfying them. 


Theorem 2.2 (Euler gamma function). There exists a unique function $\Gamma$ of a complex variable $s$ that has the following properties:

1. $\Gamma(s)$ is a meromorphic function on $\mathbb{C}$.

2. Connection to factorials: $\Gamma(n+1) = n!$ for $n = 0, 1, 2, \ldots$.

3. Important special value: $\Gamma(1/2) = \sqrt{\pi}$.

4. Integral representation: 


\[ \Gamma(s) = \int_0^\infty e^{-x} x^{s-1}\,dx \qquad (\operatorname{Re}(s) > 0). \tag{2.2} \]


5. Infinite product representation: 


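\[ \Gamma(s) = s^{-1} e^{-\gamma s} \prod_{n=1}^{\infty} \left(1 + \frac{s}{n}\right)^{-1} e^{s/n} \qquad (s \in \mathbb{C}), \tag{2.3} \]

where $\gamma = \lim_{n\to\infty}\left(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \log n\right) \approx 0.577215$ is the Euler-Mascheroni constant.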

6. Limit of finite products representation: 

\[ \Gamma(s) = \lim_{n\to\infty} \frac{n!\, n^{s}}{s(s+1)\cdots(s+n)} \qquad (s \in \mathbb{C}). \tag{2.4} \]


7. Zeros: the gamma function has no zeros (so $\Gamma(s)^{-1}$ is an entire function).

8. Poles: the gamma function has poles precisely at the nonpositive integers $s = 0, -1, -2, \ldots$ and is holomorphic everywhere else. The pole at $s = -n$ is a simple pole with residue

\[ \operatorname{Res}_{s=-n}(\Gamma) = \frac{(-1)^n}{n!} \qquad (n = 0, 1, 2, \ldots). \]

9. Functional equation:

\[ \Gamma(s+1) = s\,\Gamma(s) \qquad (s \in \mathbb{C}). \tag{2.5} \]

10. Reflection formula:

\[ \Gamma(s)\,\Gamma(1-s) = \frac{\pi}{\sin(\pi s)} \qquad (s \in \mathbb{C}). \tag{2.6} \]


To begin the proofs, we do have to define the function we are claiming exists somehow, so we take formula (2.2) as our working definition of $\Gamma(s)$. Fix $a > 0$. If $s$ is in the half-plane $\{\operatorname{Re}(s) > a > 0\}$, then

\[ \left| \int_0^\infty e^{-x} x^{s-1}\,dx \right| \le \int_0^\infty e^{-x} \left| x^{s-1} \right| dx = \int_0^\infty e^{-x} x^{\operatorname{Re}(s)-1}\,dx < \infty. \]

Thus the improper integral (2.2) converges in the region $\operatorname{Re}(s) > 0$ (uniformly on any half-plane $\operatorname{Re}(s) > a > 0$) and therefore defines a function $\Gamma(s)$ which, by the result of Exercise 1.26, is holomorphic in that region.

Next, perform an integration by parts, to get that for $\operatorname{Re}(s) > 0$, we have

\[ \Gamma(s+1) = \int_0^\infty e^{-x} x^{s}\,dx = -e^{-x} x^{s}\Big|_{x=0}^{x=\infty} + \int_0^\infty e^{-x} s x^{s-1}\,dx = s\,\Gamma(s), \]

which is the functional equation (2.5).
Combining the trivial evaluation $\Gamma(1) = \int_0^\infty e^{-x}\,dx = 1$ with the functional equation shows by induction that $\Gamma(n+1) = n!$.


Why is the gamma function shifted from the factorial by 1? 


The titular question above is a standard one that gets asked by many students introduced to the gamma function but is rarely discussed in print. If you assume that the role of the gamma function as a well-behaved extension of the factorial function to noninteger values is one of its most important properties, then the shifting of the value of the argument by 1 seems to make little sense, and the competing definition of a “factorial function”

\[ \Pi(s) = \Gamma(s+1) \]

would appear to be the more logical and natural one. In fact, historically, both definitions coexisted for some time, and the reasons why the notation $\Gamma(s)$ won the day and became established as the standard one are not entirely clear; this may be more of an accident of history than anything else.

Nonetheless, there are indeed some good reasons to accept $\Gamma(s)$ as the more natural and sensible notational convention, at least in the context of complex-analytic applications (as opposed to, say, uses of the gamma function in combinatorics). See [W13] for further discussion of this issue.
The special value $\Gamma(1/2) = \sqrt{\pi}$ follows immediately by a change of variable $x = u^2$ in the integral (2.2) and an appeal to the standard Gaussian integral $\int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{\pi}$:

\[ \Gamma(1/2) = \int_0^\infty e^{-x} x^{-1/2}\,dx = 2\int_0^\infty e^{-u^2}\,du = \int_{-\infty}^{\infty} e^{-u^2}\,du = \sqrt{\pi}. \]
0 0 re] 


The functional equation (2.5), which so far we have only established in the region $\operatorname{Re}(s) > 0$, where our working definition (2.2) is valid, can now be used to perform an analytic continuation of $\Gamma(s)$ to a meromorphic function on $\mathbb{C}$. This is done in a series of steps: as the first step, define

\[ \Gamma_1(s) = \frac{\Gamma(s+1)}{s}, \]

which is a function that is holomorphic on $\operatorname{Re}(s) > -1$, $s \neq 0$, and coincides with $\Gamma(s)$ for $\operatorname{Re}(s) > 0$. By the principle of analytic continuation this provides a unique extension of $\Gamma(s)$ to a meromorphic function in the region $\operatorname{Re}(s) > -1$. Because of the factor $1/s$ and the fact that $\Gamma(1) = 1$, we also see that $\Gamma_1(s)$ has a simple pole at $s = 0$ with residue 1.
Next, for $\operatorname{Re}(s) > -2$, we define

\[ \Gamma_2(s) = \frac{\Gamma_1(s+1)}{s} = \frac{\Gamma(s+2)}{s(s+1)}, \]

a function that is holomorphic on $\operatorname{Re}(s) > -2$, $s \neq 0, -1$, and coincides with $\Gamma_1(s)$ for $\operatorname{Re}(s) > -1$, $s \neq 0$. Again, this provides an analytic continuation of $\Gamma(s)$ to that region. The factor $1/(s(s+1))$ shows that $\Gamma_2(s)$ has a simple pole at $s = -1$ with residue $-1$.

Continuing by induction, having defined an analytic continuation $\Gamma_{n-1}(s)$ of $\Gamma(s)$ to the region $\operatorname{Re}(s) > -n+1$, $s \neq 0, -1, -2, \ldots, -n+2$, we now define

\[ \Gamma_n(s) = \frac{\Gamma_{n-1}(s+1)}{s} = \cdots = \frac{\Gamma(s+n)}{s(s+1)\cdots(s+n-1)}. \]

By inspection we see that this gives a meromorphic function in $\operatorname{Re}(s) > -n$ whose poles are precisely at $s = -n+1, \ldots, 0$ and have the claimed residues.

We constructed a sequence of meromorphic functions $\Gamma_n(s)$ that are analytic continuations of the original function $\Gamma(s)$ defined in (2.2) to a growing sequence of regions whose union is the entire complex plane. By packaging all these continuations into a single object we see that we have proved the existence of a unique meromorphic function on all of $\mathbb{C}$ that is an analytic continuation of the original $\Gamma(s)$ and whose restriction to each of the half-planes $\operatorname{Re}(s) > -n$ coincides with the $n$th function $\Gamma_n(s)$ in the sequence. By a standard abuse of notation, we continue to denote this global analytically continued version of $\Gamma(s)$ by $\Gamma(s)$.

As a partial summary, we established the existence and uniqueness of $\Gamma(s)$ as a function of a complex variable satisfying properties 1, 2, 3, 4, 8, and 9 in Theorem 2.2. We now proceed with the proof of the remaining properties.
Lemma 2.3. For $\operatorname{Re}(s) > 0$, we have

\[ \Gamma(s) = \lim_{n\to\infty} \int_0^n \left(1 - \frac{x}{n}\right)^n x^{s-1}\,dx. \tag{2.7} \]

Proof. The right-hand side of (2.7) can be rewritten as $\int_0^\infty \left(1 - \frac{x}{n}\right)^n \chi_{[0,n]}(x)\, x^{s-1}\,dx$ (where $\chi_A$ denotes the characteristic function of a set $A$). The integrand in this expression converges to $e^{-x} x^{s-1}$ pointwise as $n \to \infty$. By the elementary inequality $1 - t \le e^{-t}$ ($t \in \mathbb{R}$) we have

\[ \left| \left(1 - \frac{x}{n}\right)^n \chi_{[0,n]}(x)\, x^{s-1} \right| \le e^{-x} x^{\operatorname{Re}(s)-1} \qquad (x > 0). \]

The claim therefore follows from the dominated convergence theorem. □


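Lemma 2.4. For $\operatorname{Re}(s) > 0$, we have

\[ \int_0^n \left(1 - \frac{x}{n}\right)^n x^{s-1}\,dx = \frac{n!\, n^{s}}{s(s+1)\cdots(s+n)}. \]

Proof. For $n = 1$, the claim is that

\[ \int_0^1 (1 - x)\, x^{s-1}\,dx = \frac{1}{s(s+1)}, \]

which is easy to verify directly. For the general claim, using a linear change of variables and integration by parts, we see that

\begin{align*}
\int_0^n \left(1 - \frac{x}{n}\right)^n x^{s-1}\,dx
  &= n^{s} \int_0^1 (1-t)^n t^{s-1}\,dt \\
  &= n^{s} \left[ (1-t)^n \frac{t^{s}}{s} \Big|_{t=0}^{t=1} + \frac{n}{s} \int_0^1 (1-t)^{n-1} t^{s}\,dt \right] \\
  &= n^{s}\, \frac{n}{s} \int_0^1 (1-t)^{n-1} t^{(s+1)-1}\,dt,
\end{align*}

so the claim follows by induction on $n$.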


Combining the results of Lemmas 2.3 and 2.4, we obtain the “limit of finite products” representation (2.4), except that we only proved it for $\operatorname{Re}(s) > 0$. To establish it for general $s$, note first that

\begin{align*}
\frac{n!\, n^{s}}{s(s+1)\cdots(s+n)}
  &= s^{-1} n^{s} \left(1 + \frac{s}{1}\right)^{-1} \left(1 + \frac{s}{2}\right)^{-1} \cdots \left(1 + \frac{s}{n}\right)^{-1} \\
  &= s^{-1} e^{-s\left(\sum_{k=1}^{n} \frac{1}{k} - \log n\right)} \prod_{k=1}^{n} \left(1 + \frac{s}{k}\right)^{-1} e^{s/k},
\end{align*}

which is an expression whose limit (if it exists) is the expression on the right-hand side of (2.3). This shows that representations (2.3) and (2.4) are equivalent, and from the discussion above, both of them hold at least for $\operatorname{Re}(s) > 0$.

We now check that the infinite product on the right-hand side of (2.3) (or rather its reciprocal, corresponding to the entire function $\Gamma(s)^{-1}$, which is slightly more convenient) satisfies the assumptions of Proposition 1.60 (with $\Omega = \mathbb{C}$) and therefore defines an entire function. Indeed, if $K$ is a compact subset of $\mathbb{C}$, then, for $s \in K$, we have

\[ \left(1 + \frac{s}{n}\right) e^{-s/n} = \left(1 + \frac{s}{n}\right)\left(1 - \frac{s}{n} + O\!\left(\frac{1}{n^2}\right)\right) = 1 + O\!\left(\frac{1}{n^2}\right). \]

(Here the big-O notation hides a constant that depends on $K$ but not on $n$.)

Therefore the infinite product $\prod_{n=1}^{\infty} \left(1 + \frac{s}{n}\right) e^{-s/n}$ indeed defines an entire function, and relations (2.3) and (2.4) must hold for all $s \in \mathbb{C}$ by the principle of analytic continuation.
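As an informal numerical illustration of representation (2.4), and not part of the proof, the following sketch compares the $n$th finite product with the value returned by Python's standard `math.gamma`. The helper name `gamma_finite_product` is introduced here only for illustration, and the sketch is restricted to real $s$ that are not nonpositive integers.

```python
import math

def gamma_finite_product(s: float, n: int) -> float:
    """n-th term of (2.4): n! * n**s / (s (s+1) ... (s+n)), for real s not in {0, -1, -2, ...}."""
    # Work with logarithms of absolute values to avoid overflow in n!,
    # and track the sign of the denominator separately.
    log_abs = math.lgamma(n + 1) + s * math.log(n)
    sign = 1.0
    for k in range(n + 1):
        factor = s + k
        log_abs -= math.log(abs(factor))
        if factor < 0:
            sign = -sign
    return sign * math.exp(log_abs)

for s in (0.5, 2.5, -1.5):
    print(s, gamma_finite_product(s, 100_000), math.gamma(s))
```

The convergence is slow (the relative error decays roughly like $1/n$), which is one reason formula (2.4) is more useful as a theoretical tool than as a way of computing $\Gamma(s)$.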

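The last property that remains to be proved from the list of properties in Theorem 2.2 is the reflection formula (2.6). To prove this, we use the functional equation to transform the factor $\Gamma(1-s)$ as $(-s)\Gamma(-s)$ and then apply the infinite product formulas (2.3) and (1.72) for the gamma and sine functions, respectively, to get that

\begin{align*}
\frac{1}{\Gamma(s)\Gamma(1-s)}
  &= \frac{1}{\Gamma(s) \cdot (-s)\Gamma(-s)} \\
  &= -s^{-1} \cdot s e^{\gamma s} \prod_{n=1}^{\infty} \left(1 + \frac{s}{n}\right) e^{-s/n} \cdot (-s) e^{-\gamma s} \prod_{n=1}^{\infty} \left(1 - \frac{s}{n}\right) e^{s/n} \\
  &= s \prod_{n=1}^{\infty} \left(1 - \frac{s^2}{n^2}\right) = \frac{\sin(\pi s)}{\pi},
\end{align*}

as claimed.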
An alternative method for proving (2.6) avoids the use of the infinite product formulas. Assume that $s$ is real and satisfies $0 < s < 1$ (proving the identity for such $s$ implies it for all $s$ by analytic continuation). Then we have that

\begin{align*}
\Gamma(s)\Gamma(1-s)
  &= \int_0^\infty e^{-t} t^{-s}\, \Gamma(s)\,dt
   = \int_0^\infty e^{-t} t^{-s} \left( t \int_0^\infty e^{-vt} (vt)^{s-1}\,dv \right) dt \\
  &= \int_0^\infty \int_0^\infty e^{-t(1+v)} v^{s-1}\,dv\,dt
   = \int_0^\infty \left( \int_0^\infty e^{-t(1+v)}\,dt \right) v^{s-1}\,dv \\
  &= \int_0^\infty \frac{v^{s-1}}{1+v}\,dv
   = \int_{-\infty}^{\infty} \frac{e^{sx}}{1+e^{x}}\,dx \qquad (\text{by setting } v = e^{x}).
\end{align*}

So the claim reduces to the definite integral evaluation

\[ \int_{-\infty}^{\infty} \frac{e^{sx}}{1+e^{x}}\,dx = \frac{\pi}{\sin(\pi s)} \qquad (0 < s < 1). \]

This definite integral appeared in Exercise 1.47 and can be evaluated in a straightforward manner using contour integration techniques.
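A quick floating-point sanity check of the reflection formula (2.6) for a few real arguments, including negative non-integers where the analytic continuation is needed, can be sketched as follows. It relies only on `math.gamma` from the Python standard library; the function name `reflection_gap` is a hypothetical helper introduced for this illustration.

```python
import math

def reflection_gap(s: float) -> float:
    """Relative difference between Gamma(s)*Gamma(1-s) and pi/sin(pi*s) for real non-integer s."""
    lhs = math.gamma(s) * math.gamma(1.0 - s)
    rhs = math.pi / math.sin(math.pi * s)
    return abs(lhs - rhs) / abs(rhs)

for s in (0.25, 0.5, -0.3, 2.7):
    print(s, reflection_gap(s))
```

At $s = 1/2$ the identity reduces to $\Gamma(1/2)^2 = \pi$, in agreement with property 3 of Theorem 2.2.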


Suggested exercises for Section 2.2. 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7. 


2.3 The Riemann zeta function: definition and basic properties 


The Riemann zeta function (often referred to simply as the zeta function when there is no risk of confusion), like the Euler gamma function, is considered one of the most important special functions in “higher” mathematics. However, the Riemann zeta function is a lot more mysterious than the gamma function and remains the subject of many famous open problems, including the most famous of them all, the Riemann hypothesis, widely regarded as one of the most important open problems in mathematics today.

The main reason for the importance of the zeta function is its connection with prime 
numbers and other concepts and quantities from number theory. Its study and in par- 
ticular the attempts to prove the Riemann hypothesis have also stimulated an unusually 
large number of important developments in many areas of mathematics. 

As with the gamma function, the Riemann zeta function is usually defined on only 
part of the complex plane, and its definition is then extended by analytic continuation, 
which can be done in many different ways. Again, this strikes me as in some sense “miss- 
ing the point” of the Riemann zeta function as a natural mathematical object that exists 
independently of which of the many formulas for it you choose as your definition. I will 
present the function in the form of a theorem summarizing its most important formulas 
and properties. 


Theorem 2.5 (Riemann zeta function). There exists a unique function, denoted $\zeta(s)$, of a complex variable $s$, having the following properties:

1. $\zeta(s)$ is a meromorphic function on $\mathbb{C}$.

2. Series formula: for $\operatorname{Re}(s) > 1$, $\zeta(s)$ is given by the series

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^{s}} = 1 + \frac{1}{2^{s}} + \frac{1}{3^{s}} + \cdots. \tag{2.8} \]
3. Euler product formula: for $\operatorname{Re}(s) > 1$, $\zeta(s)$ also has an infinite product representation

\[ \zeta(s) = \prod_{p} \frac{1}{1 - p^{-s}}, \tag{2.9} \]

where the product ranges over the prime numbers $p = 2, 3, 5, 7, 11, \ldots$.
4. $\zeta(s)$ has no zeros in the region $\operatorname{Re}(s) > 1$.
5. The “trivial” zeros: the zeros of $\zeta(s)$ in the region $\operatorname{Re}(s) < 0$ are precisely at $s = -2, -4, -6, \ldots$.
6. $\zeta(s)$ has a unique pole, located at $s = 1$. It is a simple pole with residue 1.
7. The “Basel problem” and its generalizations: the values of $\zeta(s)$ at even positive integers are given by Euler’s formula

\[ \zeta(2n) = \frac{(-1)^{n-1} (2\pi)^{2n}}{2\,(2n)!}\, B_{2n} \qquad (n = 1, 2, \ldots), \tag{2.10} \]

where $(B_m)_{m=0}^{\infty}$ are the Bernoulli numbers, defined as the coefficients in the Taylor expansion

\[ \frac{z}{e^{z} - 1} = \sum_{m=0}^{\infty} \frac{B_m}{m!}\, z^{m}. \]

Some of the properties of these remarkable numbers were discussed in Exercise 1.15.
8. Values at negative integers: we have

\[ \zeta(-n) = -\frac{B_{n+1}}{n+1} \qquad (n = 1, 2, 3, \ldots). \]

(Note that for negative even integers, this coincides with the property stated above about the trivial zeros at $s = -2, -4, -6, \ldots$, since the Bernoulli numbers satisfy $B_{2k+1} = 0$ for integer $k \ge 1$. However, this formula adds information about the values of $\zeta(s)$ at negative odd integers.)

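9. Functional equation: the zeta function satisfies

\[ \zeta^{*}(1-s) = \zeta^{*}(s) \qquad (s \in \mathbb{C}), \tag{2.11} \]

where we denote by $\zeta^{*}(s)$ the symmetrized zeta function

\[ \zeta^{*}(s) = \pi^{-s/2}\, \Gamma\!\left(\frac{s}{2}\right) \zeta(s). \tag{2.12} \]

An equivalent form for the functional equation is

\[ \zeta(s) = 2^{s} \pi^{s-1} \sin\!\left(\frac{\pi s}{2}\right) \Gamma(1-s)\, \zeta(1-s). \tag{2.13} \]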
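10. Integral representation: an expression for $\zeta(s)$ valid for all $s \in \mathbb{C}$ is

\[ \pi^{-s/2}\, \Gamma\!\left(\frac{s}{2}\right) \zeta(s) = -\frac{1}{s} - \frac{1}{1-s} + \frac{1}{2} \int_1^\infty \left( t^{-s/2-1/2} + t^{s/2-1} \right) (\theta(t) - 1)\,dt, \tag{2.14} \]

where $\theta(t)$ is the Jacobi theta function² defined as

\[ \theta(t) = \sum_{n=-\infty}^{\infty} e^{-\pi n^2 t} = 1 + 2 \sum_{n=1}^{\infty} e^{-\pi n^2 t}. \tag{2.15} \]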

To begin the proof of Theorem 2.5, we take as the definition of $\zeta(s)$ the standard infinite series representation (2.8). Since $\sum_n |n^{-s}| = \sum_n n^{-\operatorname{Re}(s)}$, we see that the series converges absolutely precisely when $\operatorname{Re}(s) > 1$ and that the convergence is uniform on any half-plane of the form $\operatorname{Re}(s) > a$ with $a > 1$. In particular, it is uniform on compact subsets, so $\zeta(s)$ is holomorphic in this region.

We now prove the Euler product formula (2.9). Intuitively, the remarkable identity between the infinite series (2.8) and the product (2.9) is often described as an analytic restatement of the fact that any positive integer has a unique factorization into primes. Indeed, observe that each of the factors $(1 - p^{-s})^{-1}$ in the product can be expanded as a geometric series in powers of $p^{-s}$. Setting aside issues of convergence for a moment, the product can therefore be written as

\[ \prod_{p} \frac{1}{1 - p^{-s}} = \prod_{p} \left( 1 + p^{-s} + p^{-2s} + p^{-3s} + \cdots \right) = \sum_{\substack{n = p_1^{j_1} \cdots p_k^{j_k} \\ p_1, \ldots, p_k \text{ primes}}} \frac{1}{n^{s}}. \tag{2.16} \]

This last summation is in fact a sum over all positive integers $n$ (with each $n$ being summed over precisely once) by the fundamental theorem of arithmetic. So the sum is equal to $\sum_{n=1}^{\infty} \frac{1}{n^{s}} = \zeta(s)$.

This calculation is appealing and memorable but lacking in rigor, since we have not said anything about the assumptions about $s$, nor justified our expansion of an infinite product of infinite series into a single infinite series. A fully rigorous (though more tedious) version of the same calculation proceeds as follows. Define the holomorphic function

\[ Z(s) = \prod_{p} \left(1 - p^{-s}\right)^{-1} \]

and note that this product converges absolutely if and only if the series $\sum_p |p^{-s}| = \sum_p p^{-\operatorname{Re}(s)}$ converges and in particular if $\operatorname{Re}(s) > 1$. It follows that $Z(s)$ is well-defined

2 The same name is also used to refer to several other closely related functions; see Section 5.13, where 
some of those functions are discussed. 
and nonzero for $\operatorname{Re}(s) > 1$. We now prove that in this region, $Z(s) = \zeta(s)$. This can be done by manipulating the partial products associated with the infinite product defining $Z(s)$ in a similar vein to (2.16): if we denote by $\zeta_N(s)$ the product $\prod_{p \le N} (1 - p^{-s})^{-1}$ (still a product over primes), then

\[ \zeta_N(s) = \prod_{p \le N} \frac{1}{1 - p^{-s}} = \prod_{p \le N} \left( 1 + p^{-s} + p^{-2s} + p^{-3s} + \cdots \right). \]

This is a product of a finite number of infinite series, each of them absolutely convergent in $\operatorname{Re}(s) > 1$. By the standard fact from analysis that in such a product, the summands can be rearranged and summed in any order we desire, we see that the product can be expanded as

\[ \zeta_N(s) = \sum_{\substack{n = p_1^{j_1} \cdots p_k^{j_k} \\ p_1, \ldots, p_k \text{ primes} \le N}} \frac{1}{n^{s}}. \]

So we have represented $\zeta_N(s)$ as a series of a similar form to (2.8) but involving terms of the form $n^{-s}$ only for those positive integers $n$ whose prime factorization contains only primes $\le N$. This set of integers in particular contains all the integers in $[1, N]$. It follows that

\[ |\zeta(s) - \zeta_N(s)| \le \sum_{n > N} |n^{-s}| = \sum_{n > N} n^{-\operatorname{Re}(s)}. \]

Taking the limit as $N \to \infty$ shows that $Z(s) = \lim_{N \to \infty} \zeta_N(s) = \zeta(s)$. This proves the validity of the Euler product formula. As a corollary, we also get that $\zeta(s)$ has no zeros in the region $\operatorname{Re}(s) > 1$ (Property 4 in Theorem 2.5) since we already noted that $Z(s)$ has this property.
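The partial products $\zeta_N(s)$ from the argument above are easy to experiment with. The following sketch (an illustration only; the helper names `primes_up_to` and `zeta_partial_product` are introduced here and are not from the text) compares $\zeta_N(2)$ with the known value $\zeta(2) = \pi^2/6$ given by (2.10), and checks the tail bound $0 < \zeta(2) - \zeta_N(2) \le \sum_{n > N} n^{-2} < 1/N$.

```python
import math

def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: all primes p <= n (assumes n >= 2)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0] = sieve[1] = 0
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Cross out the multiples i*i, i*i + i, ... of the prime i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def zeta_partial_product(s: float, N: int) -> float:
    """zeta_N(s): the product over primes p <= N of 1 / (1 - p**(-s)), for real s > 1."""
    prod = 1.0
    for p in primes_up_to(N):
        prod /= 1.0 - p ** (-s)
    return prod

exact = math.pi ** 2 / 6
for N in (10, 100, 10_000):
    approx = zeta_partial_product(2.0, N)
    print(N, approx, exact - approx)
```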

Next, we prove that $\zeta(s)$ can be analytically continued to a meromorphic function
on C that has a pole at s = 1 and is holomorphic everywhere else. In the process of doing 
so, we will also obtain a proof of the functional equation (2.11). We will be aided by an 
important result from harmonic analysis, the Poisson summation formula. 


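Theorem 2.6 (Poisson summation formula). Let $f : \mathbb{R} \to \mathbb{C}$ be differentiable infinitely many times, and assume that $\sup_{x \in \mathbb{R}} |x^{n} f^{(k)}(x)| < \infty$ for all $k, n \ge 0$.³ Then

\[ \sum_{n=-\infty}^{\infty} f(n) = \sum_{k=-\infty}^{\infty} \hat{f}(k), \tag{2.17} \]

where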


3 A function satisfying these assumptions is called a Schwartz function. See Section A.6. 
\[ \hat{f}(u) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i x u}\,dx \qquad (u \in \mathbb{R}) \]

is the Fourier transform of $f$.


Proof. Define the function $g : [0,1] \to \mathbb{C}$ by

\[ g(x) = \sum_{n=-\infty}^{\infty} f(x+n). \tag{2.18} \]

By the assumptions on $f$ the series defining $g(x)$ converges, and $g$ is differentiable. Note that $g(0) = g(1)$, so that $g$ can also be interpreted as a periodic function on $\mathbb{R}$ or, equivalently, as a function on the circle $\mathbb{R}/\mathbb{Z}$; consequently, it is sometimes referred to as the “periodicization” of $f$. Now, since $g$ is periodic and differentiable, a standard result from harmonic analysis [67, Thm. 2.1, p. 81] states that $g(x)$ will have a pointwise convergent Fourier series of the form

\[ g(x) = \sum_{k=-\infty}^{\infty} \hat{g}(k)\, e^{2\pi i k x}, \tag{2.19} \]

where $\hat{g}(k)$ are the Fourier coefficients of $g$ given by

\[ \hat{g}(k) = \int_0^1 g(x)\, e^{-2\pi i k x}\,dx. \]

In particular, the case $x = 0$ of (2.19) is the relation

\[ g(0) = \sum_{k=-\infty}^{\infty} \hat{g}(k). \tag{2.20} \]

Moreover, the Fourier coefficient $\hat{g}(k)$ can be expressed in terms of the Fourier transform of the original function $f(x)$:

\begin{align*}
\hat{g}(k) &= \int_0^1 g(x)\, e^{-2\pi i k x}\,dx = \int_0^1 \sum_{n=-\infty}^{\infty} f(x+n)\, e^{-2\pi i k x}\,dx \\
  &= \sum_{n=-\infty}^{\infty} \int_0^1 f(x+n)\, e^{-2\pi i k x}\,dx = \sum_{n=-\infty}^{\infty} \int_n^{n+1} f(u)\, e^{-2\pi i k u}\,du \\
  &= \int_{-\infty}^{\infty} f(u)\, e^{-2\pi i k u}\,du = \hat{f}(k). \tag{2.21}
\end{align*}

Combining (2.20) and (2.21), we get that

\[ g(0) = \sum_{k=-\infty}^{\infty} \hat{f}(k), \]

the quantity on the right-hand side of (2.17). On the other hand, setting $x = 0$ in (2.18) gives

\[ g(0) = \sum_{n=-\infty}^{\infty} f(n), \]

so (2.17) follows.


Theorem 2.7 (Functional equation for the Jacobi theta function). The Jacobi theta function $\theta(t)$ satisfies the functional equation

\[ \theta\!\left(\frac{1}{t}\right) = \sqrt{t}\,\theta(t) \qquad (t > 0). \tag{2.22} \]


We remark that equations of the form (2.22) and its variants are studied in the theory 
of modular forms, which is the subject of Chapter 5. Indeed, when we learn about this 
more general theory, we will see that $\theta(t)$ can be seen as belonging to a more general
class of Jacobi theta functions, which are special functions with many applications in 
number theory and other areas of mathematics. See Section 5.13.1 and also Chapter 6. 


Proof of Theorem 2.7. Fix $t > 0$, and define the function $f : \mathbb{R} \to \mathbb{R}$ (depending on the parameter $t$) by

\[ f(x) = e^{-\pi t x^2}. \tag{2.23} \]

The function $f$ clearly satisfies the assumptions of Theorem 2.6, so (2.17) holds. Note that the Fourier transform of $f$ is given by

\[ \hat{f}(u) = t^{-1/2} e^{-\pi u^2 / t}. \tag{2.24} \]

Indeed, for $t = 1$, it is the standard integral

\[ \int_{-\infty}^{\infty} e^{-\pi x^2} e^{-2\pi i x u}\,dx = e^{-\pi u^2} \tag{2.25} \]

(that is, the well-known fact that the function $e^{-\pi x^2}$ is its own Fourier transform; see Exercise 1.47), and for general $t > 0$, this follows from (2.25) by a linear change of variables. Now substituting (2.23)-(2.24) into (2.17) immediately gives (2.22).


An alternative method of proving (2.22) using purely complex-analytic arguments 
is discussed in Exercise 2.15. 
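The functional equation (2.22) can also be observed numerically. The sketch below (illustrative only; the function name `theta` and the truncation parameter `terms` are choices made here, not part of the text) sums the rapidly convergent series (2.15) and compares $\theta(1/t)$ with $\sqrt{t}\,\theta(t)$.

```python
import math

def theta(t: float, terms: int = 200) -> float:
    """Truncated series (2.15): 1 + 2 * sum_{n=1}^{terms} exp(-pi n^2 t), for t > 0."""
    return 1.0 + 2.0 * sum(math.exp(-math.pi * n * n * t) for n in range(1, terms + 1))

for t in (0.05, 0.5, 1.0, 3.0):
    print(t, theta(1.0 / t), math.sqrt(t) * theta(t))
```

For small $t$ the series for $\theta(t)$ needs many terms while the series for $\theta(1/t)$ is essentially equal to 1 after a single term; this asymmetry is what Lemma 2.8 exploits.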
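Lemma 2.8. The asymptotic behavior of $\theta(t)$ near $t = 0$ and $t = +\infty$ is given by

\[ \theta(t) = O\!\left(\frac{1}{\sqrt{t}}\right) \qquad (t \to 0+), \tag{2.26} \]
\[ \theta(t) = 1 + O\!\left(e^{-\pi t}\right) \qquad (t \to \infty). \tag{2.27} \]

Proof. The claim about the behavior of $\theta(t)$ as $t \to \infty$ is immediate from

\[ \theta(t) - 1 = 2 \sum_{n=1}^{\infty} e^{-\pi n^2 t} \le 2 \sum_{n=1}^{\infty} e^{-\pi n t} = \frac{2 e^{-\pi t}}{1 - e^{-\pi t}}, \]

which is bounded by $3e^{-\pi t}$ if $t > 1$. This gives (2.27). Using (2.22) now gives that $\theta(t) = t^{-1/2}\theta(1/t) = t^{-1/2}\left(1 + O(e^{-\pi/t})\right) = O(t^{-1/2})$ as $t \to 0+$, which proves (2.26).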


We are now ready to prove that $\zeta(s)$ can be analytically continued to a meromorphic function on $\mathbb{C}$. This will be done by deriving representation (2.14) for $\operatorname{Re}(s) > 1$ and showing that the expression on the right-hand side of (2.14) in fact defines a meromorphic function on $\mathbb{C}$. Start with the identity

\[ \Gamma\!\left(\frac{s}{2}\right) = \int_0^\infty e^{-x} x^{s/2-1}\,dx \]

for $\operatorname{Re}(s) > 0$. A linear change of variables $x = \pi n^2 t$ brings this to the form

\[ \pi^{-s/2}\, \Gamma\!\left(\frac{s}{2}\right) n^{-s} = \int_0^\infty e^{-\pi n^2 t}\, t^{s/2-1}\,dt. \tag{2.28} \]

Summing the left-hand side over $n = 1, 2, \ldots$ gives $\pi^{-s/2}\,\Gamma(s/2)\,\zeta(s)$ (the function we denoted $\zeta^{*}(s)$), except that in order for this sum to converge, we now make the more restrictive assumption that $\operatorname{Re}(s) > 1$. Similarly, performing the same summation on the right-hand side of (2.28), we have that

\[ \sum_{n=1}^{\infty} \int_0^\infty e^{-\pi n^2 t}\, t^{s/2-1}\,dt = \int_0^\infty \sum_{n=1}^{\infty} e^{-\pi n^2 t}\, t^{s/2-1}\,dt = \int_0^\infty \frac{\theta(t) - 1}{2}\, t^{s/2-1}\,dt. \]

Here we again assume that $\operatorname{Re}(s) > 1$; by Lemma 2.8 this ensures that the integral in the last expression is absolutely convergent and therefore also, by the dominated convergence theorem, that it is permissible to interchange the order of summation and integration as we did.

Summarizing the above discussion, we have obtained the representation

\[ \zeta^{*}(s) = \frac{1}{2} \int_0^\infty (\theta(t) - 1)\, t^{s/2-1}\,dt \qquad (\operatorname{Re}(s) > 1) \]

for the symmetrized zeta function $\zeta^{*}(s)$ defined in (2.12). It is convenient to rewrite this as

\[ \zeta^{*}(s) = \int_0^\infty \varphi(t)\, t^{s/2-1}\,dt \qquad (\operatorname{Re}(s) > 1), \]

where we denote $\varphi(t) = \frac{1}{2}(\theta(t) - 1)$. Next, we use the functional equation (2.22) for $\theta(t)$ to bring this integral to a new form, which is well-defined for all $s \in \mathbb{C}$ except $s = 0, 1$. More specifically, note that (2.22) can be expressed in the equivalent form

\[ \varphi(t) = t^{-1/2} \varphi(1/t) + \frac{1}{2} t^{-1/2} - \frac{1}{2}. \]

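We can therefore write, still assuming that $\operatorname{Re}(s) > 1$,

\begin{align*}
\zeta^{*}(s) &= \int_1^\infty \varphi(t)\, t^{s/2-1}\,dt + \int_0^1 \varphi(t)\, t^{s/2-1}\,dt \\
  &= \int_0^1 \left( \frac{1}{2} t^{-1/2} - \frac{1}{2} \right) t^{s/2-1}\,dt + \int_0^1 \varphi(1/t)\, t^{s/2-3/2}\,dt + \int_1^\infty \varphi(t)\, t^{s/2-1}\,dt \\
  &= -\frac{1}{s} + \frac{1}{s-1} + \int_1^\infty \left( t^{-s/2-1/2} + t^{s/2-1} \right) \varphi(t)\,dt. \tag{2.29}
\end{align*}

This is representation (2.14). Now observe that since $\varphi(t) = O(e^{-\pi t})$ as $t \to \infty$, the integral $\int_1^\infty (t^{-s/2-1/2} + t^{s/2-1})\varphi(t)\,dt$ satisfies the assumptions of Exercise 1.26 and therefore defines an entire function of $s$. Thus we have derived a formula for $\zeta^{*}(s)$ that defines a meromorphic function on all of $\mathbb{C}$, whose only poles are the simple poles at $s = 0, 1$ (due to the two terms $-1/s$ and $1/(s-1)$ in (2.29)). This concludes the proof that $\zeta(s)$ can be analytically continued to a meromorphic function on $\mathbb{C}$. The functional equation (2.11) also follows trivially: simply observe that the representation we derived for $\zeta^{*}(s)$ is manifestly symmetric with respect to replacing each occurrence of $s$ by $1 - s$.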

It is straightforward to verify that the two forms (2.11) and (2.13) of the functional 
equation are equivalent (Exercise 2.8). 

The claims from Theorem 2.5 that remain to be proved are properties 5-8. Property 7 
was proved in Chapter 1 as one of the consequences of the partial fraction expansion of 
the cotangent function (see Exercise 1.39). The remaining properties will now follow as 
a sequence of easy corollaries to the results we already proved. 


Corollary 2.9. The only pole of $\zeta(s)$ is a simple pole at $s = 1$ with residue 1.

Proof. Our representation for $\zeta^{*}(s)$ expresses it as a sum of $-\frac{1}{s}$, $\frac{1}{s-1}$, and an entire function. Thus the poles of $\zeta^{*}(s)$ are simple poles at $s = 0, 1$ with residues $-1$ and $1$, respectively. It follows that

\[ \zeta(s) = \pi^{s/2}\, \Gamma(s/2)^{-1}\, \zeta^{*}(s) \]

has a pole at $s = 1$ with residue $\pi^{1/2}\,\Gamma(1/2)^{-1} = 1$ and a pole (that turns out to be a removable singularity) at $s = 0$ with residue $-\pi^{0}\,\Gamma(0)^{-1} = 0$. (That is, the pole of $\zeta^{*}(s)$ at $s = 0$ is canceled out by the zero of $\Gamma(s/2)^{-1}$.)


Corollary 2.10. $\zeta(-n) = -B_{n+1}/(n+1)$ for $n = 1, 2, 3, \ldots$.

Proof. Let $n \ge 1$. Using version (2.13) of the functional equation, we have that

\begin{align*}
\zeta(-n) &= 2^{-n} \pi^{-n-1} \sin(-\pi n/2)\, \Gamma(n+1)\, \zeta(n+1) \\
  &= 2^{-n} \pi^{-n-1} \sin(-\pi n/2)\, n!\, \zeta(n+1).
\end{align*}

If $n = 2k$ is even, then $\sin(-\pi n/2) = 0$, so we get that $\zeta(-2k) = 0$ (that is, $s = -2k$ is one of the so-called “trivial zeros”). We also know from Exercise 1.15 that $B_{2k+1} = 0$ for $k = 1, 2, 3, \ldots$, so the formula $\zeta(-n) = -B_{n+1}/(n+1)$ is satisfied in this case.

If on the other hand $n = 2k - 1$ is odd, then $\sin(-\pi(2k-1)/2) = (-1)^{k}$, and therefore using (2.10), we get that

\begin{align*}
\zeta(-n) &= (-1)^{k}\, 2^{-2k+1} \pi^{-2k} (2k-1)!\, \zeta(2k) \\
  &= (-1)^{k}\, 2^{-2k+1} \pi^{-2k} (2k-1)!\, \frac{(-1)^{k-1} (2\pi)^{2k}}{2\,(2k)!}\, B_{2k} \\
  &= -\frac{B_{2k}}{2k} = -\frac{B_{n+1}}{n+1},
\end{align*}

so again the formula is satisfied.
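To make the special values concrete, here is a small sketch that computes Bernoulli numbers exactly and evaluates the formulas of Corollary 2.10 and (2.10). It assumes the convention $z/(e^{z}-1) = \sum_m B_m z^m/m!$ (so that $B_1 = -1/2$) and uses the standard recurrence $\sum_{j=0}^{m} \binom{m+1}{j} B_j = 0$ for $m \ge 1$, which is not derived in this section; the function names are hypothetical helpers.

```python
from fractions import Fraction
from math import comb, factorial, pi

def bernoulli_numbers(m_max: int) -> list[Fraction]:
    """Exact B_0, ..., B_{m_max}, with the convention B_1 = -1/2."""
    B = [Fraction(1)]
    for m in range(1, m_max + 1):
        # Solve sum_{j=0}^{m} C(m+1, j) * B_j = 0 for B_m.
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B

def zeta_at_negative_integer(n: int) -> Fraction:
    """zeta(-n) = -B_{n+1} / (n+1) for n >= 1 (Corollary 2.10)."""
    return -bernoulli_numbers(n + 1)[n + 1] / (n + 1)

def zeta_at_even_integer(n: int) -> float:
    """zeta(2n) from Euler's formula (2.10), as a floating-point number."""
    B2n = bernoulli_numbers(2 * n)[2 * n]
    return (-1) ** (n - 1) * (2 * pi) ** (2 * n) / (2 * factorial(2 * n)) * float(B2n)

print([zeta_at_negative_integer(n) for n in range(1, 6)])
print(zeta_at_even_integer(1), pi ** 2 / 6)
print(zeta_at_even_integer(2), pi ** 4 / 90)
```

The first printed list shows the pattern described in the text: the values at $-2$ and $-4$ vanish (trivial zeros), while the values at negative odd integers are nonzero rationals such as $\zeta(-1) = -1/12$.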


Corollary 2.11. The zeros of $\zeta(s)$ in the region $\operatorname{Re}(s) < 0$ are precisely the trivial zeros $s = -2, -4, -6, \ldots$.


Proof. We have already established the existence of the trivial zeros. We leave to you to 
verify that the fact that there are no other zeros follows immediately from the functional 
equation. 


Suggested exercises for Section 2.3. 2.8, 2.9, 2.10, 2.11, 2.12, 2.13, 2.14, 2.15, 2.16, 2.17, 2.18. 


2.4 A theorem on the zeros of the Riemann zeta function


Next, we prove a subtle and very important fact about the zeta function, which will play 
a crucial role in our proof of the prime number theorem. 


Theorem 2.12. $\zeta(s)$ has no zeros on the line $\operatorname{Re}(s) = 1$.


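Proof. For this proof, denote $s = \sigma + it$, where we assume that $\sigma > 1$ and $t$ is real and nonzero. The proof is based on investigating simultaneously the behavior of $\zeta(\sigma + it)$, $\zeta(\sigma + 2it)$, and $\zeta(\sigma)$ for fixed $t$ as $\sigma \searrow 1$. Consider the following somewhat mysterious quantity:

\[ X = \log\left| \zeta(\sigma)^{3}\, \zeta(\sigma + it)^{4}\, \zeta(\sigma + 2it) \right|. \]

Using the Euler product formula (2.9), we can evaluate $X$ as

\begin{align*}
X &= 3\log|\zeta(\sigma)| + 4\log|\zeta(\sigma + it)| + \log|\zeta(\sigma + 2it)| \\
  &= 3\log\Big( \prod_{p \text{ prime}} |1 - p^{-\sigma}|^{-1} \Big) + 4\log\Big( \prod_{p \text{ prime}} |1 - p^{-\sigma - it}|^{-1} \Big) + \log\Big( \prod_{p \text{ prime}} |1 - p^{-\sigma - 2it}|^{-1} \Big) \\
  &= \sum_{p \text{ prime}} \left( -3\log|1 - p^{-\sigma}| - 4\log|1 - p^{-\sigma - it}| - \log|1 - p^{-\sigma - 2it}| \right) \\
  &= \sum_{p \text{ prime}} \left( -3\operatorname{Re}[\operatorname{Log}(1 - p^{-\sigma})] - 4\operatorname{Re}[\operatorname{Log}(1 - p^{-\sigma - it})] - \operatorname{Re}[\operatorname{Log}(1 - p^{-\sigma - 2it})] \right),
\end{align*}

where in the last expression, $\operatorname{Log}(\cdot)$ denotes the principal branch of the logarithm function. Now note that for $z = a + ib$ with $a > 1$ and an arbitrary prime number $p$, we have $|p^{-z}| = p^{-a} < 1$, so by the Taylor expansion (1.93) of the $\operatorname{Log}(\cdot)$ function,

\[ -\operatorname{Log}(1 - p^{-z}) = \sum_{m=1}^{\infty} \frac{p^{-mz}}{m}, \]

and therefore

\begin{align*}
-\operatorname{Re}[\operatorname{Log}(1 - p^{-z})] &= \sum_{m=1}^{\infty} \frac{p^{-ma}}{m} \operatorname{Re}[\cos(mb \log p) - i \sin(mb \log p)] \\
  &= \sum_{m=1}^{\infty} \frac{p^{-ma}}{m} \cos(mb \log p).
\end{align*}

This means that if we define quantities $\beta_n$ and $c_n$ for $n \ge 1$ by

\[ \beta_n = t \log n, \qquad c_n = \begin{cases} 1/m & \text{if } n = p^{m} \text{ for some prime } p, \\ 0 & \text{otherwise,} \end{cases} \]

then we can rewrite $X$ as

\[ X = \sum_{n=1}^{\infty} c_n n^{-\sigma} \left( 3 + 4\cos\beta_n + \cos(2\beta_n) \right). \]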
We can now use the simple trigonometric identity

\[ 3 + 4\cos\beta + \cos(2\beta) = 2(1 + \cos\beta)^{2} \]

to rewrite $X$ yet again as

\[ X = 2 \sum_{n=1}^{\infty} c_n n^{-\sigma} (1 + \cos\beta_n)^{2}. \]

We have proved the crucial fact that $X \ge 0$ or, equivalently, that

\[ e^{X} = \left| \zeta(\sigma)^{3}\, \zeta(\sigma + it)^{4}\, \zeta(\sigma + 2it) \right| \ge 1. \tag{2.30} \]

We now claim that this innocent-looking inequality is incompatible with the existence of a zero of $\zeta(s)$ on the line $\operatorname{Re}(s) = 1$. Indeed, assume by contradiction that $\zeta(1 + it) = 0$ for some real $t \neq 0$. Then the three quantities $\zeta(\sigma)$, $\zeta(\sigma + it)$, and $\zeta(\sigma + 2it)$ have the following asymptotic behavior as $\sigma \searrow 1$:

\begin{align*}
|\zeta(\sigma)| &= \frac{1}{\sigma - 1} + O(1) && \text{(since $\zeta(s)$ has a pole at $s = 1$),} \\
|\zeta(\sigma + it)| &= O(\sigma - 1) && \text{(since $\zeta(s)$ has a zero at $s = 1 + it$),} \\
|\zeta(\sigma + 2it)| &= O(1) && \text{(since $\zeta(s)$ is holomorphic at $s = 1 + 2it$).}
\end{align*}

Combining these results, we have that

\[ e^{X} = \left| \zeta(\sigma)^{3}\, \zeta(\sigma + it)^{4}\, \zeta(\sigma + 2it) \right| = O\!\left( (\sigma - 1)^{-3} (\sigma - 1)^{4} \right) = O(\sigma - 1). \]

Thus $e^{X} \to 0$ as $\sigma \searrow 1$, in contradiction to (2.30). This finishes the proof.
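For completeness, the trigonometric identity used in the proof above can be checked in one line from the double-angle formula $\cos(2\beta) = 2\cos^2\beta - 1$:

```latex
\[
3 + 4\cos\beta + \cos(2\beta)
  = 3 + 4\cos\beta + 2\cos^2\beta - 1
  = 2\left(1 + 2\cos\beta + \cos^2\beta\right)
  = 2(1 + \cos\beta)^2 \;\ge\; 0 .
\]
```

The nonnegativity of this expression is the only property of the particular combination of exponents $3, 4, 1$ that the proof relies on.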


2.5 Proof of the prime number theorem 


The prime number theorem (Theorem 2.1) was proved in 1896 by Jacques Hadamard 
and independently by Charles Jean de la Vallée Poussin using the groundbreaking ideas 
from Riemann’s famous 1859 paper, in which he introduced the use of the Riemann zeta 
function as a tool for counting prime numbers. The history of these developments is 
described in great detail (both historical and technical) in the book [25]. 

The original proofs of the prime number theorem were very complicated and relied 
on the “explicit formula of number theory” and some of its variants (see the box on p. 109).
Throughout the twentieth century, mathematicians worked hard to find simpler ways 
to derive the prime number theorem. This resulted in several important developments 
(such as the Wiener Tauberian theorem and the Hardy-Littlewood Tauberian theorem) 
that advanced not just the state of analytic number theory but also of complex analysis, 
harmonic analysis, and functional analysis. Despite all the efforts and the discovery of 
several paths to a proof that were simpler than the original approach, all known proofs
remained quite difficult. A minor breakthrough occurred in 1980 when Donald Newman
discovered a surprisingly simple way to derive the theorem using relatively elementary
complex-analytic arguments. The proof presented here is adapted from a version of
Newman’s proof due to Zagier [74]; see also [44, 49, 70].

Recall that the prime number theorem concerns the so-called prime-counting function $\pi(x)$ defined as the number of primes that are less than or equal to $x$. It is helpful to write this in the form of a sum over primes, namely

\[ \pi(x) = \#\{p \text{ prime} : p \le x\} = \sum_{p \le x} 1 \]

with the convention that the symbol $p$ in summations always refers to primes. We also define the Chebyshev function $\psi(x)$ as a closely related weighted sum

\[ \psi(x) = \sum_{p^{k} \le x} \log p = \sum_{p \le x} \log p \left\lfloor \frac{\log x}{\log p} \right\rfloor. \]

In this definition, the first sum is over prime powers $p^{k}$ (with integer $k \ge 1$); the second sum is an alternative and trivially equivalent way of writing $\psi(x)$ as a sum over primes rather than over prime powers. Another customary and equivalent way to write the function $\psi(x)$ is as

\[ \psi(x) = \sum_{n \le x} \Lambda(n), \]

where the function $\Lambda(n)$, called the von Mangoldt function, is defined by

\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^{k} \text{ with } p \text{ prime}, k \ge 1, \\ 0 & \text{otherwise.} \end{cases} \]

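Lemma 2.13. The prime number theorem $\pi(x) \sim \frac{x}{\log x}$ is equivalent to the statement that $\psi(x) \sim x$.

Proof. The functions $\psi(x)$ and $\pi(x)$ can be related to each other in an approximate sense through two simple inequalities. First, observe that

\[ \psi(x) = \sum_{p \le x} \log p \left\lfloor \frac{\log x}{\log p} \right\rfloor \le \sum_{p \le x} \log p\, \frac{\log x}{\log p} = \sum_{p \le x} \log x = \log x \cdot \pi(x). \tag{2.31} \]

Second, in the opposite direction, we have that for any $0 < \epsilon < 1$ and $x \ge 2$,

\begin{align*}
\psi(x) &\ge \sum_{p \le x} \log p \ge \sum_{x^{1-\epsilon} < p \le x} \log p \ge \sum_{x^{1-\epsilon} < p \le x} \log(x^{1-\epsilon}) \\
  &= (1-\epsilon) \log x \left( \pi(x) - \pi(x^{1-\epsilon}) \right) \ge (1-\epsilon) \log x \left( \pi(x) - x^{1-\epsilon} \right). \tag{2.32}
\end{align*}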


Now assume that ψ(x) ~ x as x → ∞. Then (2.31) implies that π(x) ≥ ψ(x)/log x, and therefore

\[ \liminf_{x \to \infty} \frac{\pi(x)}{x/\log x} \ge 1. \tag{2.33} \]

On the other hand, (2.32) gives that π(x) ≤ (1/(1−ε)) · ψ(x)/log x + x^{1−ε}, which then implies that

\[ \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x} \le \frac{1}{1-\epsilon} + \limsup_{x \to \infty} \frac{\log x}{x^{\epsilon}} = \frac{1}{1-\epsilon}. \]

Since ε was an arbitrary number in (0, 1), it follows that

\[ \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x} \le 1. \tag{2.34} \]

Combining (2.33) and (2.34) gives that π(x) ~ x/log x. This proves one of the two implications
claimed in the lemma.


To prove the reverse implication, assume that π(x) ~ x/log x and note that, by (2.31),

\[ \limsup_{x \to \infty} \frac{\psi(x)}{x} \le \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x} = 1. \tag{2.35} \]

On the other hand, (2.32) implies that

\[ \liminf_{x \to \infty} \frac{\psi(x)}{x} \ge \liminf_{x \to \infty} \, (1-\epsilon) \left( \frac{\pi(x)}{x/\log x} - \frac{\log x}{x^{\epsilon}} \right) = 1 - \epsilon. \]

Again, since ε ∈ (0, 1) was arbitrary, it follows that liminf_{x→∞} ψ(x)/x ≥ 1. When combined
with (2.35), we have shown that lim_{x→∞} ψ(x)/x = 1, as claimed.


A hint of the significance of the Chebyshev function and the equivalent form ψ(x) ~ x
of the prime number theorem is offered by the next lemma.


Lemma 2.14. For Re(s) > 1 we have

\[ -\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \Lambda(n) n^{-s}. \tag{2.36} \]


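Proof. Using the Euler product formula and taking the logarithmic derivative (which is
an operation that works as it should when applied to infinite products of holomorphic
functions that are uniformly convergent on compact subsets), we have

\[ \begin{aligned}
-\frac{\zeta'(s)}{\zeta(s)} &= \sum_{p} \frac{\frac{d}{ds}\bigl(1 - p^{-s}\bigr)}{1 - p^{-s}} = \sum_{p} \frac{\log p \cdot p^{-s}}{1 - p^{-s}} \\
&= \sum_{p} \log p \, \bigl(p^{-s} + p^{-2s} + p^{-3s} + \cdots\bigr) = \sum_{p \text{ prime}} \sum_{k=1}^{\infty} \log p \cdot p^{-ks} \\
&= \sum_{n=1}^{\infty} \Lambda(n) n^{-s}.
\end{aligned} \]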

At this point in the discussion, we can already outline a plausible-sounding heuristic
explanation for why the prime number theorem might be true. Consider the two sequences
a_n = Λ(n) and b_n = 1. By Lemma 2.13 the prime number theorem is equivalent
to the claim that

\[ \frac{1}{x} \sum_{n \le x} a_n \sim \frac{1}{x} \sum_{n \le x} b_n \quad \text{as } x \to \infty, \tag{2.37} \]

that is, that the sequences a_n and b_n exhibit similar average asymptotic behavior. On the
other hand, if we are willing to be a bit more flexible about interpreting what we mean
by “average”, that is, replacing the straightforward arithmetic averages by a certain class
of weighted averages, then there is a statement of this type that is easily seen to be true,
namely, the statement that

\[ \frac{1}{\zeta(\sigma)} \sum_{n=1}^{\infty} a_n n^{-\sigma} \sim \frac{1}{\zeta(\sigma)} \sum_{n=1}^{\infty} b_n n^{-\sigma} \quad \text{as } \sigma \searrow 1. \tag{2.38} \]

Indeed, the right-hand side of this relation is equal to 1, and the left-hand side is, by (2.36),
equal to (−ζ′(σ)/ζ(σ))/ζ(σ), which converges to 1 as σ ↘ 1 due to the fact that both the numerator
and the denominator in this fraction have a simple pole with residue 1 at σ = 1.
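To make the last claim concrete, here is a short worked computation. It is a sketch that assumes the Laurent expansion of the zeta function around its pole (the subject of Exercise 2.9), with γ denoting the Euler–Mascheroni constant; the expansion of the logarithmic derivative follows by differentiating the logarithm of the first line.

```latex
\begin{align*}
\zeta(\sigma) &= \frac{1}{\sigma-1} + \gamma + O(\sigma-1), \\
-\frac{\zeta'(\sigma)}{\zeta(\sigma)} &= \frac{1}{\sigma-1} - \gamma + O(\sigma-1), \\
\frac{-\zeta'(\sigma)/\zeta(\sigma)}{\zeta(\sigma)}
  &= \frac{1 - \gamma(\sigma-1) + O\bigl((\sigma-1)^2\bigr)}{1 + \gamma(\sigma-1) + O\bigl((\sigma-1)^2\bigr)}
   = 1 - 2\gamma(\sigma-1) + O\bigl((\sigma-1)^2\bigr)
   \xrightarrow[\ \sigma \searrow 1\ ]{} 1.
\end{align*}
```

In particular, the weighted averages of Λ(n) and of the constant sequence 1 agree to first order as σ ↘ 1, which is all that the heuristic uses.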

The above argument raises the question of whether this heuristic explanation can 
be turned into a proof. That is, is it generally true that an asymptotic equivalence of the 
form (2.38) can be used to deduce the more natural equivalence (2.37)? Or, if it is not true 
in unrestricted generality, what additional assumptions are needed to make such a de- 
duction correct, and are these assumptions satisfied for our particular case of interest? 
The general area in which such questions belong is that of Tauberian theorems (a name 
honoring an 1897 result of the mathematician Alfred Tauber, who proved an important 
early result of this type). These questions turn out to be quite delicate, and although this 
approach does in fact offer a viable route toward a proof of the prime number theorem 
(see [47, p.261]), following this route requires rather involved ideas from Fourier anal- 
ysis. Here we take a slightly different path that, although also in line with the general 
philosophy of Tauberian theorems, starts by further reducing the problem into that of 
showing the convergence of a certain improper integral. The following lemma gives the 
details of this simple reduction. 


Lemma 2.15. Assume that the improper integral

\[ \int_1^{\infty} \left( \frac{\psi(x)}{x} - 1 \right) \frac{dx}{x} \tag{2.39} \]


converges. Then the prime number theorem follows.

Proof. Keeping in mind Lemma 2.13, we will prove the contrapositive claim that if ψ(x)/x
does not converge to 1 as x → ∞, then the integral (2.39) cannot converge.

Assume that ψ(x)/x ↛ 1. In this scenario, either L_+ = limsup_{x→∞} ψ(x)/x > 1, or
L_− = liminf_{x→∞} ψ(x)/x < 1. In the first case, observe that there are arbitrarily large
values of x for which ψ(x)/x > 1 + 2ε, where ε > 0 is a fixed constant chosen small enough
that 1 + 2ε < L_+. For a value of x with that property, using the fact that ψ(x) is weakly
monotone increasing, we see that

\[ \int_x^{(1+\epsilon)x} \left( \frac{\psi(t)}{t} - 1 \right) \frac{dt}{t} \ge \int_x^{(1+\epsilon)x} \left( \frac{(1+2\epsilon)x}{(1+\epsilon)x} - 1 \right) \frac{dt}{t} = \frac{\epsilon}{1+\epsilon} \log(1+\epsilon) =: C > 0. \]

Thus we have shown that the function I(T) = ∫_1^T (ψ(x)/x − 1) dx/x has infinitely many intervals
over which it changes value by at least the fixed positive constant C, which implies that
the improper integral (2.39) cannot converge.

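Similarly, in the second case in which L_− < 1, we again note that there are arbitrarily
large values of x for which ψ(x)/x < 1 − 2ε, where ε is defined as the constant ε = (1 − L_−)/4
(which is positive and trivially bounded from above by 1/4). For such x, again from the
monotonicity of ψ(x) we get that

\[ \int_{(1-\epsilon)x}^{x} \left( \frac{\psi(t)}{t} - 1 \right) \frac{dt}{t} \le \int_{(1-\epsilon)x}^{x} \left( \frac{(1-2\epsilon)x}{(1-\epsilon)x} - 1 \right) \frac{dt}{t} = -\frac{\epsilon}{1-\epsilon} \log\frac{1}{1-\epsilon} \le -\frac{\epsilon^2}{1-\epsilon}. \]

This is again inconsistent with the possibility that the integral (2.39) converges.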


One additional ingredient of our proof is the following elementary bound on the 
Chebyshev function. 


Lemma 2.16. There is a constant C > 0 such that ψ(x) ≤ Cx for all x ≥ 1.


Proof. The idea of the proof is that the binomial coefficient (2n choose n) is not too large on the
one hand but is divisible by many primes (at least all primes between n + 1 and 2n) on the
other hand; hence it follows that there cannot be too many primes, and in particular the
weighted prime-counting function ψ(x) can be easily bounded from above using such
an argument. More precisely, we have that

\[ \begin{aligned}
2^{2n} = (1+1)^{2n} = \sum_{k=0}^{2n} \binom{2n}{k} &\ge \binom{2n}{n} \ge \prod_{n < p \le 2n} p = \exp\Bigl( \sum_{n < p \le 2n} \log p \Bigr) \\
&= \exp\Bigl( \psi(2n) - \psi(n) - \sum_{n < p^k \le 2n,\ k \ge 2} \log p \Bigr).
\end{aligned} \tag{2.40} \]

The sum in the last expression is easily bounded as

\[ \sum_{n < p^k \le 2n,\ k \ge 2} \log p \le 10 \sqrt{n} \log^2 n + 10 \qquad (n \ge 1) \tag{2.41} \]

(Exercise 2.19). Thus taking the logarithm of the first and last expressions in (2.40), we
get the bound

\[ \psi(2n) - \psi(n) \le 2n \log 2 + 10 \sqrt{n} \log^2 n + 10 \le \frac{C}{2} n \]

for all n ≥ 1 with some constant C > 0. For n of the form n = 2^m, m ≥ 0, this allows us to
write

\[ \psi(2^m) = \sum_{j=0}^{m-1} \bigl( \psi(2^{j+1}) - \psi(2^j) \bigr) \le \frac{C}{2} \sum_{j=0}^{m-1} 2^j \le \frac{C}{2} \cdot 2^m, \]

thereby establishing the inequality ψ(x) ≤ ½Cx for any x that is a power of 2. Finally, for
a general integer x ≥ 1, we can represent x as x = 2^m + ℓ for some m ≥ 0 and 0 ≤ ℓ < 2^m.
We then observe that

\[ \psi(x) = \psi(2^m + \ell) \le \psi(2^{m+1}) \le C \cdot 2^m \le Cx, \]

which is the desired bound.
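Both ingredients of this argument can be checked by machine for small values. The following Python sketch (helper names are illustrative, and the search ranges are arbitrary choices) verifies that the product of the primes in (n, 2n] divides the central binomial coefficient, which is at most 4^n, and tabulates ψ(n)/n to see how large the ratio gets on an initial range.

```python
import math

def chebyshev_psi_table(n_max):
    """Return a list psi with psi[n] = psi(n) for 0 <= n <= n_max."""
    lam = [0.0] * (n_max + 1)
    is_composite = bytearray(n_max + 1)
    for p in range(2, n_max + 1):
        if not is_composite[p]:
            for m in range(2 * p, n_max + 1, p):
                is_composite[m] = 1
            q = p
            while q <= n_max:
                lam[q] = math.log(p)      # Lambda(p^k) = log p
                q *= p
    psi, running = [0.0] * (n_max + 1), 0.0
    for n in range(1, n_max + 1):
        running += lam[n]
        psi[n] = running
    return psi

def primes_between(n):
    """Primes p with n < p <= 2n (trial division; fine for small n)."""
    return [p for p in range(n + 1, 2 * n + 1)
            if all(p % d for d in range(2, math.isqrt(p) + 1))]

def product_divides_central_binomial(n):
    """Check that the product of the primes in (n, 2n] divides C(2n, n) <= 4^n."""
    c = math.comb(2 * n, n)
    prod = math.prod(primes_between(n))
    return c % prod == 0 and c <= 4 ** n

psi = chebyshev_psi_table(10**5)
worst_ratio = max(psi[n] / n for n in range(1, 10**5 + 1))
print(worst_ratio)
```

The printed ratio is only evidence on a finite range, of course; the lemma is what guarantees a uniform constant C for all x.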


We are ready to state a Tauberian theorem, which in some sense forms the heart of 
the proof of the prime number theorem. 


Theorem 2.17 (Newman’s Tauberian theorem). Let f : [1, ∞) → R be a bounded function
that is integrable on compact intervals. Define a function g(s) of a complex variable s by

\[ g(s) = \int_1^{\infty} f(x) x^{-s-1} \, dx. \tag{2.42} \]

Clearly, g(s) is defined and holomorphic in the open half-plane Re(s) > 0. Assume that
g(s) has an analytic continuation to an open region Ω containing the closed half-plane
Re(s) ≥ 0. Then the improper integral

\[ \int_1^{\infty} \frac{f(x)}{x} \, dx \tag{2.43} \]

converges, and its value is equal to g(0), the value at s = 0 of the analytic continuation
of g.

Before we proceed with the proof, it is worth pausing to appreciate the subtlety
of this result. The conclusion of the theorem about the existence of the improper integral
(2.43) can be expressed as the statement that

\[ \lim_{T \to \infty} \int_1^T \frac{f(x)}{x} \, dx = \lim_{\epsilon \searrow 0} \int_1^{\infty} f(x) x^{-\epsilon-1} \, dx. \]

This sort of equivalence of limits seems to fall readily within the realm of real analysis. It 
is remarkable that the condition needed for this conclusion to hold is a complex-analytic 
condition involving the existence of an analytic continuation for the function g(s) (and, 
moreover, to a region that contains parts that extend arbitrarily far from the real axis). 
If you were not already convinced of the importance and relevance of complex analysis 
to the rest of mathematics, I hope this will make you rethink your skepticism! 
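A simple example, which is not taken from the text and is offered only as an illustration, suggests why the hypothesis must involve the whole closed half-plane and not just a neighborhood of the real axis. Take f(x) = sin(log x), which is bounded and integrable on compact intervals. With the substitution u = log x we get:

```latex
\begin{align*}
g(s) &= \int_1^\infty \sin(\log x)\, x^{-s-1}\,dx
      = \int_0^\infty \sin(u)\, e^{-su}\,du
      = \frac{1}{s^2+1} \qquad (\mathrm{Re}(s) > 0), \\
\int_1^T \frac{\sin(\log x)}{x}\,dx
     &= \int_0^{\log T} \sin(u)\,du = 1 - \cos(\log T).
\end{align*}
```

Here g extends to a meromorphic function with poles at s = ±i, which lie on the line Re(s) = 0, so the assumption of the theorem fails. Correspondingly, the limit of g(ε) as ε ↘ 0 exists and equals 1, but the partial integrals oscillate between 0 and 2 and the improper integral (2.43) does not converge.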


Proof of Theorem 2.17. Define a truncated version of the integral defining g(s), namely

\[ g_T(s) = \int_1^T f(x) x^{-s-1} \, dx \]

for T > 1. We claim that g_T(s) is an entire function of s for any fixed T. This can be proved
using Morera’s theorem: let γ be a closed contour in C. Then

\[ \oint_{\gamma} g_T(s) \, ds = \oint_{\gamma} \int_1^T f(x) x^{-s-1} \, dx \, ds = \int_1^T f(x) \oint_{\gamma} x^{-s-1} \, ds \, dx = \int_1^T 0 \, dx = 0. \]

In the above calculation, interchanging the order of the two integrals is justified by Fubini’s
theorem, which (as we can easily check) is applicable in the current situation.
Since the integral of g_T(s) over an arbitrary closed contour γ vanishes, g_T is entire by
Morera’s theorem.

Now our goal is to show that lim_{T→∞} g_T(0) = g(0). This will be achieved through
an application of Cauchy’s integral formula. Fix some large number R > 0 and a small
number δ > 0 (which depends on R in a way that will be explained shortly), and consider
the contour C consisting of the part of the circle |s| = R that lies in the half-plane Re(s) ≥
−δ together with the straight line segment along the line Re(s) = −δ connecting the top
and bottom intersection points of this circle with the line (see Fig. 2.1(a)). Assume that δ is
small enough so that g(s) (which by the assumptions of the theorem extends analytically
at least slightly to the left of Re(s) = 0) is holomorphic in an open set containing C and
the region enclosed by it. Then by Cauchy’s integral formula the difference g(0) − g_T(0)
can be expressed as

\[ g(0) - g_T(0) = \frac{1}{2\pi i} \oint_C \bigl( g(s) - g_T(s) \bigr) T^s \left( 1 + \frac{s^2}{R^2} \right) \frac{ds}{s}. \tag{2.44} \]


Figure 2.1: The contours C, A, B, and B′ (panels (a), (b), (c)).

Note that this equation would still hold true if the integrand on the right-hand side were
the simpler expression (g(s) − g_T(s))/s; however, Newman’s inspired observation was that the
inclusion of the additional factors T^s(1 + s²/R²) actually helps by producing an integral that
can be estimated effectively (while keeping the value of the integral the same). To see
how this works, start by separating the contour C into two parts, a semicircular arc A
that lies in the half-plane Re(s) > 0 and the remaining part B in the half-plane Re(s) < 0
(Fig. 2.1(b)). We can then write


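\[ g(0) - g_T(0) = I_1 + I_2, \tag{2.45} \]

where

\[ I_1 = \frac{1}{2\pi i} \int_A \bigl( g(s) - g_T(s) \bigr) T^s \left( 1 + \frac{s^2}{R^2} \right) \frac{ds}{s}, \tag{2.46} \]

\[ I_2 = \frac{1}{2\pi i} \int_B \bigl( g(s) - g_T(s) \bigr) T^s \left( 1 + \frac{s^2}{R^2} \right) \frac{ds}{s}. \tag{2.47} \]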


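We now bound I_1 and I_2 separately. Denote

\[ M = \sup_{t \ge 1} |f(t)| \]

(and recall the assumption that this number is finite). For s with Re(s) > 0, we are in the
region where formula (2.42) is valid, so we can bound the expression g(s) − g_T(s) as

\[ \begin{aligned}
|g(s) - g_T(s)| &= \left| \int_1^{\infty} f(x) x^{-s-1} \, dx - \int_1^T f(x) x^{-s-1} \, dx \right| = \left| \int_T^{\infty} f(x) x^{-s-1} \, dx \right| \\
&\le M \int_T^{\infty} \bigl| x^{-s-1} \bigr| \, dx = \frac{M T^{-\mathrm{Re}(s)}}{\mathrm{Re}(s)}.
\end{aligned} \tag{2.48} \]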

Note also that for s satisfying |s| = R, we have that

\[ \left| T^s \left( 1 + \frac{s^2}{R^2} \right) \frac{1}{s} \right| = T^{\mathrm{Re}(s)} \left| \frac{\bar{s}}{R} + \frac{s}{R} \right| \frac{1}{R} = T^{\mathrm{Re}(s)} \, \frac{2 |\mathrm{Re}(s)|}{R^2}. \tag{2.49} \]

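The bounds (2.48)–(2.49) both apply on the subcontour A, so by combining them we get
that

\[ |I_1| \le \frac{1}{2\pi} \, (\pi R) \, \frac{2M}{R^2} = \frac{M}{R}. \tag{2.50} \]

Next, we bound I_2 by bounding the contributions from g(s) and g_T(s) separately, that is,
further decomposing that integral as

\[ I_2 = \frac{1}{2\pi i} \int_B g(s) T^s \left( 1 + \frac{s^2}{R^2} \right) \frac{ds}{s} - \frac{1}{2\pi i} \int_B g_T(s) T^s \left( 1 + \frac{s^2}{R^2} \right) \frac{ds}{s} =: J_1 - J_2. \tag{2.51} \]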

In the case of J_2, since g_T(s) is an entire function and the only singularity of the integrand
is at s = 0, we can deform the integration contour B replacing it with the semicircular
arc B′ = {s : |s| = R, Re(s) < 0} (Fig. 2.1(c)). By Cauchy’s theorem the value of the integral
remains the same. On the new contour B′ the bound (2.49) holds, and there we also have
the estimate

\[ |g_T(s)| = \left| \int_1^T f(x) x^{-s-1} \, dx \right| \le M \int_1^T \bigl| x^{-s-1} \bigr| \, dx = M \, \frac{T^{|\mathrm{Re}(s)|} - 1}{|\mathrm{Re}(s)|} \le \frac{M T^{|\mathrm{Re}(s)|}}{|\mathrm{Re}(s)|}. \]

Therefore, similarly to (2.50), we have the bound

\[ |J_2| \le \frac{1}{2\pi} \, (\pi R) \, \frac{2M}{R^2} = \frac{M}{R}. \tag{2.52} \]

The remaining integral J_1 tends to 0 as T → ∞ (with R fixed), since the dependence
on T is only through the factor T^s, which converges to 0 uniformly on compact sets in
Re(s) < 0 as T → ∞.
Combining this last observation with (2.45), (2.50), (2.51), and (2.52), we have therefore
shown that

\[ \limsup_{T \to \infty} |g(0) - g_T(0)| \le \frac{2M}{R}. \]

Since R was an arbitrary positive number, the lim sup must be 0, and the theorem is
proved.
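The role of Newman's extra factor can be checked numerically. The sketch below (function names and the sample values of T and R are illustrative choices, not from the text) compares the modulus of the kernel T^s(1 + s²/R²)/s on the circle |s| = R with the closed form in (2.49). The point of the identity is that the factor 2|Re(s)|/R² exactly cancels the 1/Re(s) blow-up in (2.48) near the imaginary axis.

```python
import cmath
import math

def kernel_modulus(s, T, R):
    """|T^s (1 + s^2/R^2) / s| for complex s, computed directly."""
    return abs(T ** s * (1 + s * s / R**2) / s)

def claimed_modulus(s, T, R):
    """Right-hand side of (2.49): T^Re(s) * 2|Re(s)| / R^2 (valid when |s| = R)."""
    return T ** s.real * 2 * abs(s.real) / R**2

def max_discrepancy(T=7.0, R=3.0, samples=360):
    """Largest difference between the two expressions over sample points on |s| = R."""
    worst = 0.0
    for m in range(samples):
        s = R * cmath.exp(2j * math.pi * (m + 0.5) / samples)
        worst = max(worst, abs(kernel_modulus(s, T, R) - claimed_modulus(s, T, R)))
    return worst

print(max_discrepancy())
```

Multiplying the bound (2.48) by the right-hand side of (2.49) gives 2M/R², independent of T and of s, which is what makes the estimate (2.50) so clean.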


Consider now the following application of Theorem 2.17 to a specific function: take

\[ f(x) = \frac{\psi(x)}{x} - 1 \]

as our function f(x). Note that f(x) is bounded by Lemma 2.16. The associated function
g(s) is then

\[ \begin{aligned}
g(s) &= \int_1^{\infty} \left( \frac{\psi(x)}{x} - 1 \right) x^{-s-1} \, dx = \int_1^{\infty} \psi(x) x^{-s-2} \, dx - \frac{1}{s} \\
&= \int_1^{\infty} \Bigl( \sum_{n \le x} \Lambda(n) \Bigr) x^{-s-2} \, dx - \frac{1}{s} = \sum_{n=1}^{\infty} \Lambda(n) \int_n^{\infty} x^{-s-2} \, dx - \frac{1}{s} \\
&= \frac{1}{s+1} \sum_{n=1}^{\infty} \Lambda(n) n^{-s-1} - \frac{1}{s} = -\frac{1}{s+1} \cdot \frac{\zeta'(s+1)}{\zeta(s+1)} - \frac{1}{s} \qquad (\mathrm{Re}(s) > 0)
\end{aligned} \]

by (2.36). Recall that −ζ′(s)/ζ(s) has a simple pole at s = 1 with residue 1 (because ζ(s) has
a simple pole at s = 1; it is useful to remember the more general fact that if a holomorphic
function h(z) has a zero of order k at z = z_0, then the logarithmic derivative h′(z)/h(z)
has a simple pole at z = z_0 with residue k). So −ζ′(s+1)/ζ(s+1) has a simple pole with residue
1 at s = 0, and therefore −(1/(s+1)) · ζ′(s+1)/ζ(s+1) − 1/s has a removable singularity at s = 0.
Thus the identity g(s) = −(1/(s+1)) · ζ′(s+1)/ζ(s+1) − 1/s shows that g(s) extends analytically
to a holomorphic function in the region

\[ \{ s \in \mathbb{C} : \zeta(s+1) \ne 0 \}. \]

By Theorem 2.12, g(s) in particular extends holomorphically to an open set containing
the half-plane Re(s) ≥ 0.

We have therefore shown that f(x) satisfies the assumption of Newman’s Tauberian
theorem. We conclude from the theorem that the improper integral

\[ \int_1^{\infty} \left( \frac{\psi(x)}{x} - 1 \right) \frac{dx}{x} \]

converges. By Lemma 2.15 the prime number theorem follows.
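The convergence just established can be observed numerically. Expanding the formula for g(s) around s = 0, using −ζ′(s+1)/ζ(s+1) = 1/s − γ + O(s), suggests that the value of the integral is g(0) = −1 − γ, where γ is the Euler–Mascheroni constant. The Python sketch below (an illustration with hypothetical helper names) evaluates the partial integrals I(T) exactly, using the fact that ψ is a step function.

```python
import math

EULER_GAMMA = 0.5772156649015329

def partial_integral(T):
    """I(T) = integral from 1 to T of (psi(x)/x - 1) dx/x, for an integer T >= 1.

    psi is constant on each interval [n, n+1), so the integral over that interval
    equals psi(n) * (1/n - 1/(n+1)) - log((n+1)/n); the logarithms sum to log T.
    """
    lam = [0.0] * (T + 1)
    is_composite = bytearray(T + 1)
    for p in range(2, T + 1):
        if not is_composite[p]:
            for m in range(2 * p, T + 1, p):
                is_composite[m] = 1
            q = p
            while q <= T:
                lam[q] = math.log(p)      # von Mangoldt function at prime powers
                q *= p
    psi, total = 0.0, 0.0
    for n in range(1, T):
        psi += lam[n]                      # psi now equals psi(n)
        total += psi / (n * (n + 1))
    return total - math.log(T)

for T in (10**2, 10**3, 10**4, 10**5):
    print(T, partial_integral(T), -1 - EULER_GAMMA)
```

The printed values should approach −1 − γ as T grows, although the theorem itself gives no information about the rate.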


Suggested exercises for Section 2.5. 2.19, 2.20, 2.21.

The explicit formulae of number theory and the Riemann hypothesis


The proof of the prime number theorem presented in this chapter made crucial use of the fact that ζ(s)
has no zeros on the line Re(s) = 1, but when following this approach, the connection between those two
facts seems somewhat opaque and mysterious.

Another, more advanced, approach to the prime number theorem that draws a clearer conceptual
line between the location of the zeros of ζ(s) and the validity of the asymptotic formula ψ(x) ~ x is based
on the so-called “explicit formulae of number theory.” This is the name given to a family of identities, the
simplest of which is

\[ \psi(x) = x - \sum_{\rho} \frac{x^{\rho}}{\rho} - \log(2\pi) \qquad (x > 1,\ x \text{ noninteger}). \tag{2.53} \]

In this formula the sum on the right-hand side ranges over all zeros ρ of the Riemann zeta function
counted with their respective multiplicities. (In most textbooks the sum is separated into two sums, one
ranging over the trivial zeros, which can be evaluated explicitly, and the other ranging over the zeros in
the strip 0 < Re(s) < 1. Also, the sum is only conditionally convergent; refer to [47, p. 397] for the proper
way to interpret it to get a convergent sum.) Note that this is an exact identity, not an asymptotic result.
To convert it to an asymptotic result, the key observation is that each of the power terms x^ρ has magnitude
x^{Re(ρ)}. Thus, knowing that Re(ρ) < 1 suggests that the term x^ρ is of a smaller order of magnitude
than the “principal” term x and therefore plays a negligible role in the asymptotic behavior of ψ(x). This
leads directly to the asymptotic formula ψ(x) ~ x. (Note that this argument is incomplete, since there
are infinitely many zeros, so we would be dropping infinitely many of these terms, which requires further
justification.)

The same type of reasoning involving (2.53) also suggests that if we had more precise bounds
on the real parts of the zeros of ζ(s), we could prove quantitative versions of the prime number theorem
with explicit error bounds. The strongest statement of this type that is believed to hold is the celebrated
Riemann hypothesis.


Conjecture 2.18 (The Riemann hypothesis). All the nontrivial zeros of ζ(s) are on the “critical line” Re(s) =
1/2.


For more details, see [25, 46, 47] and [W14].

Exercises for Chapter 2


2.1 Prove the following properties satisfied by the Euler gamma function:
(a) Values at half-integers:

\[ \Gamma\left( n + \frac{1}{2} \right) = \frac{(2n)!}{4^n \, n!} \sqrt{\pi} \qquad (n = 0, 1, 2, \ldots). \]

(b) The duplication formula:

\[ \Gamma(s) \Gamma(s + 1/2) = 2^{1-2s} \sqrt{\pi} \, \Gamma(2s). \]

(c) The multiplication theorem: for any k ≥ 1,

\[ \Gamma(s) \, \Gamma\left( s + \frac{1}{k} \right) \Gamma\left( s + \frac{2}{k} \right) \cdots \Gamma\left( s + \frac{k-1}{k} \right) = (2\pi)^{(k-1)/2} \, k^{1/2 - ks} \, \Gamma(ks). \]

2.2 Prove the following representation for the gamma function:

\[ \Gamma(s) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n! \, (n+s)} + \int_1^{\infty} e^{-x} x^{s-1} \, dx \qquad (s \in \mathbb{C}). \]

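2.3 For n ≥ 1, let V_n denote the volume of the unit ball in R^n. By evaluating the
n-dimensional integral

\[ \int_{\mathbb{R}^n} e^{-(x_1^2 + \cdots + x_n^2)} \, dx_1 \cdots dx_n \]

in two ways, prove the well-known formula

\[ V_n = \frac{\pi^{n/2}}{\Gamma\left( \frac{n}{2} + 1 \right)}. \]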

Note. This problem requires applying a small amount of geometric intuition (or, 
alternatively, having some technical knowledge of spherical coordinates in R”). For 
the solution, see [W15]. 

2.4 The beta function is a function B(s, t) of two complex variables, defined for
Re(s), Re(t) > 0 by

\[ B(s, t) = \int_0^1 x^{s-1} (1-x)^{t-1} \, dx. \]

(a) Show that the improper integral defining B(s, t) converges absolutely if and
only if Re(s), Re(t) > 0.
(b) Show that B(s, t) can be expressed in terms of the gamma function as

\[ B(s, t) = \frac{\Gamma(s) \Gamma(t)}{\Gamma(s+t)}. \]

Guidance. Start by writing Γ(s)Γ(t) as a double integral on the positive quadrant
[0, ∞)² of R² (with integration variables, say, x and y); then make the
change of variables u = x + y, v = x/(x + y) and use the change-of-variables
formula for two-dimensional integrals to show that the integral evaluates as
Γ(s + t)B(s, t).

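2.5 The digamma function ψ(s) is the logarithmic derivative

\[ \psi(s) = \frac{\Gamma'(s)}{\Gamma(s)} \]

of the gamma function, also considered as a somewhat important special function
in its own right.
(a) Show that ψ(s) has the convergent series expansions

\[ \psi(s) = -\gamma - \frac{1}{s} + \sum_{n=1}^{\infty} \frac{s}{n(n+s)} = -\gamma + \sum_{n=0}^{\infty} \left( \frac{1}{n+1} - \frac{1}{n+s} \right) \qquad (s \ne 0, -1, -2, \ldots), \]

where γ is the Euler–Mascheroni constant.
(b) Equivalently, show that ψ(s) can be expressed as

\[ \psi(s) = -\lim_{n \to \infty} \left( \sum_{k=0}^{n} \frac{1}{k+s} - \log n \right). \]

(c) Show that ψ(s) satisfies the functional equation

\[ \psi(s+1) = \psi(s) + \frac{1}{s} \qquad (s \ne 0, -1, -2, \ldots). \]

(d) Show that

\[ \psi(n+1) = -\gamma + \sum_{k=1}^{n} \frac{1}{k} \qquad (n = 0, 1, 2, \ldots). \]

That is, ψ(x) + γ can be thought of as extending the definition of the harmonic
numbers H_n = Σ_{k=1}^{n} 1/k to noninteger arguments.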
(e) Show that ψ(s) satisfies the reflection formula

\[ \psi(1-s) - \psi(s) = \pi \cot(\pi s). \]

(f) Here is a curious application of the digamma function. Consider the sequence
of polynomials

\[ P_n(x) = x(x-1) \cdots (x-n) \qquad (n = 0, 1, 2, \ldots) \]

and their derivatives

\[ Q_n(x) = P_n'(x). \]

By Rolle’s theorem, Q_n(x) has precisely one root in each interval (k, k + 1) for
0 ≤ k ≤ n − 1. Denote this root by k + α_{n,k}, so that the numbers α_{n,k} (the fractional
parts of the roots of Q_n(x)) are in (0, 1).

A curious phenomenon can now be observed by plotting the points α_{n,k}, k =
0, ..., n − 1, numerically, say for n = 50. You will see that for large n, the plot
appears to approximate a smooth limiting curve. The following precise statement
can be proved.


Theorem 2.19 ([56]). Let t ∈ (0, 1). Let k = k(n) be a sequence such that 0 ≤
k(n) ≤ n − 1, k(n) → ∞ as n → ∞, n − k(n) → ∞ as n → ∞, and k(n)/n → t
as n → ∞. Then we have

\[ \lim_{n \to \infty} \alpha_{n,k(n)} = \frac{1}{\pi} \operatorname{arccot}\left( \frac{1}{\pi} \log \frac{1-t}{t} \right). \]

In the above formula, arccot(·) refers to the branch of the inverse cotangent
function taking values between 0 and π.


Prove Theorem 2.19 using the facts you learned about the digamma function.
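The numerical experiment suggested in part (f) of Exercise 2.5 can be carried out with a few lines of Python. The sketch below is one possible approach and not the method of [56]: it locates the root of Q_n in (k, k + 1) by bisection on the logarithmic derivative Q_n/P_n, and evaluates the limiting curve of Theorem 2.19 using arccot(y) = π/2 − arctan(y). The function names are illustrative.

```python
import math

def alpha(n, k, iterations=200):
    """Fractional part of the root of Q_n = P_n' in (k, k+1), found by bisection.

    On (k, k+1) the roots of Q_n are the zeros of Q_n/P_n = sum_j 1/(x - j),
    a function that decreases from +infinity to -infinity across the interval.
    """
    def log_derivative(x):
        return sum(1.0 / (x - j) for j in range(n + 1))

    lo, hi = float(k), float(k + 1)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:     # interval can no longer be split in floating point
            break
        if log_derivative(mid) > 0:
            lo = mid                   # still to the left of the root
        else:
            hi = mid
    return 0.5 * (lo + hi) - k

def limit_curve(t):
    """(1/pi) * arccot((1/pi) * log((1 - t)/t)), with arccot valued in (0, pi)."""
    y = math.log((1.0 - t) / t) / math.pi
    return (math.pi / 2 - math.atan(y)) / math.pi

n = 50
for k in range(0, n, 5):
    print(k, alpha(n, k), limit_curve((k + 0.5) / n))
```

Evaluating the limit curve at t = (k + 1/2)/n is an arbitrary plotting choice made here; any t close to k/n serves equally well for a visual comparison.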
2.6 Given two integrable functions f, g : R → C of a real variable, their convolution
is the function h = f ∗ g defined by the formula

\[ h(x) = (f * g)(x) = \int_{-\infty}^{\infty} f(t) g(x-t) \, dt \qquad (x \in \mathbb{R}). \]

The convolution operation is extremely important in harmonic analysis, since it
corresponds to a simple multiplication operation in the Fourier domain; in probability
theory, where it corresponds to the addition of independent random variables;
and in many other areas of mathematics, science, and engineering.
For α > 0, define the gamma density with parameter α, denoted γ_α : R → R, as

\[ \gamma_\alpha(x) = \frac{1}{\Gamma(\alpha)} e^{-x} x^{\alpha-1} \chi_{(0,\infty)}(x) \qquad (x \in \mathbb{R}) \]

(where χ_A denotes the characteristic function of a set A ⊂ R). Note that γ_α(x) is
a nonnegative function whose integral equals 1, so that it is a probability density
function.
Show that for all α, β > 0, we have

\[ \gamma_\alpha * \gamma_\beta = \gamma_{\alpha+\beta}, \]

that is, the family of density functions (γ_α)_{α>0} is closed under the convolution operation.
This fact is one of the reasons why the family of gamma densities plays an
important role in probability theory and appears in many real-life applications.

2.7 Show that the initial terms in the Laurent expansion of Γ(s) around s = 0 are of the
form

\[ \Gamma(s) = \frac{1}{s} - \gamma + O(s), \]

where γ is the Euler–Mascheroni constant.

2.8 Prove the equivalence of the two versions (2.11) and (2.13) of the functional equation
for the Riemann zeta function.

2.9 Show that the initial terms in the Laurent expansion of ζ(s) around s = 1 are of the
form

\[ \zeta(s) = \frac{1}{s-1} + \gamma + O(s-1). \]

2.10 Define the function η(s) of a complex variable s by

\[ \eta(s) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^s}. \]

This function, a close cousin of the Riemann zeta function, is known as the Dirichlet
eta function.
(a) Prove that the series defining η(s) converges uniformly on any half-plane of
the form Re(s) > α with α > 0, and conclude that η(s) is defined and holomorphic
in the half-plane Re(s) > 0.
(b) Show that η(s) is related to the Riemann zeta function by the formula

\[ \eta(s) = (1 - 2^{1-s}) \zeta(s) \qquad (\mathrm{Re}(s) > 1). \]

(c) Using this relation, deduce a new proof that the zeta function can be analytically
continued to a meromorphic function on Re(s) > 0 that has a simple pole
at s = 1 with residue 1 and is holomorphic everywhere else in the region.

2.11 Now that you have learned about the Riemann zeta function and its properties, go
back and look at identities (1.54)–(1.55). Can you make sense of what these formulas
claim? How do they relate to ζ(s) and to the Dirichlet eta function η(s) discussed in

Exercise 2.10?

2.12 Show that the Taylor expansion of the digamma function ψ(s) = Γ′(s)/Γ(s) (discussed in
Exercise 2.5) around s = 1 is given by

\[ \psi(s) = -\gamma + \sum_{n=1}^{\infty} (-1)^{n+1} \zeta(n+1) (s-1)^n \qquad (|s-1| < 1), \]

where γ is the Euler–Mascheroni constant.

2.13 (a) Prove that for all x > 1,

\[ \prod_{p \le x} \frac{1}{1 - \frac{1}{p}} \ge \log x \]

(where the product is over all prime numbers p ≤ x).
(b) Pass to the logarithm and deduce that for some constant K > 0, we have the
bound

\[ \sum_{p \le x} \frac{1}{p} \ge \log\log x - K \qquad (x \ge 1). \]

(It is also possible to show a matching upper bound of log log x + K′ for some
constant K′ > 0, that is, the harmonic series of primes Σ_{p≤x} 1/p diverges as log log x,
in contrast to the usual harmonic series, which diverges as log x.)

2.14 Riemann’s contour integral representation for ζ(s). Prove another expression
for ζ(s) valid for all s ∈ C:

\[ \zeta(s) = \frac{\Gamma(1-s)}{2\pi i} \int_C \frac{(-x)^s}{e^x - 1} \, \frac{dx}{x}, \tag{2.54} \]

where C is a keyhole contour coming from +∞ to 0 slightly above the positive
x-axis, then circling the origin in a counterclockwise direction around a circle of
small radius, and then going back to +∞ slightly below the positive x-axis.
Note. Representation (2.54) is due to Riemann, who used it in his famous 1859 paper
for his first proof of the analytic continuation and functional equation for his
eponymous zeta function. In the same paper, he proceeded to give a second proof
using the method described in Section 2.3. See [25, Ch. 1] for more details.

2.15 An alternative proof of the functional equation of the Jacobi theta function.
(a) Recall the definition of the Jacobi theta function θ(t) in (2.15). Use the residue
theorem to evaluate the contour integral

\[ \oint_{\gamma_N} \frac{e^{-\pi z^2 t}}{e^{2\pi i z} - 1} \, dz, \]

where γ_N is the rectangle with vertices ±(N + 1/2) ± i (with N a positive integer),
then take the limit as N → ∞ to derive the integral representation

\[ \theta(t) = \int_{-\infty - i}^{\infty - i} \frac{e^{-\pi z^2 t}}{e^{2\pi i z} - 1} \, dz - \int_{-\infty + i}^{\infty + i} \frac{e^{-\pi z^2 t}}{e^{2\pi i z} - 1} \, dz \qquad (t > 0) \tag{2.55} \]

for θ(t).
(b) In representation (2.55), expand the factor (e^{2πiz} − 1)^{−1} as a geometric series in
e^{−2πiz} (for the first integral) and as a geometric series in e^{2πiz} (for the second
integral). Evaluate the resulting infinite series, rigorously justifying all steps,
to obtain an alternative proof of the functional equation (2.22).
2.16 Define the following arithmetic functions taking an integer argument n:

\[ \begin{aligned}
d(n) &= \sum_{d \mid n} 1 && \text{(the number of divisors function)}, \\
\sigma(n) &= \sum_{d \mid n} d && \text{(the sum of divisors function)}, \\
\phi(n) &= \#\{1 \le k \le n-1 : \gcd(k, n) = 1\} && \text{(the Euler totient function)}, \\
\Lambda(n) &= \begin{cases} \log p & \text{if } n = p^k,\ p \text{ prime}, \\ 0 & \text{otherwise} \end{cases} && \text{(the von Mangoldt $\Lambda$-function)}, \\
\mu(n) &= \begin{cases} (-1)^k & \text{if } n = p_1 p_2 \cdots p_k \text{ is a product of } k \text{ distinct primes}, \\ 0 & \text{otherwise} \end{cases} && \text{(the Möbius $\mu$-function)}, \\
\lambda(n) &= (-1)^k \ \text{ if } n = p_1 p_2 \cdots p_k \text{ is a product of } k \text{ primes} && \text{(the Liouville $\lambda$-function)}.
\end{aligned} \]

We saw that the zeta function and its logarithmic derivative have the series representations

\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s}, \qquad -\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \Lambda(n) n^{-s}. \]

Both these series are of the general form

\[ \sum_{n=1}^{\infty} c_n n^{-s} \]

for some sequence (c_n)_{n=1}^{∞}. A series of this type is called a Dirichlet series.
Prove the following additional identities (valid for Re(s) > 1) expressing various
functions related to ζ(s) as Dirichlet series:

\[ \begin{aligned}
-\zeta'(s) &= \sum_{n=1}^{\infty} \log n \cdot n^{-s}, \\
\frac{1}{\zeta(s)} &= \sum_{n=1}^{\infty} \mu(n) n^{-s}, \\
\frac{\zeta(s)}{\zeta(2s)} &= \sum_{n=1}^{\infty} |\mu(n)| n^{-s}, \\
\frac{\zeta(2s)}{\zeta(s)} &= \sum_{n=1}^{\infty} \lambda(n) n^{-s}, \\
\zeta(s)^2 &= \sum_{n=1}^{\infty} d(n) n^{-s}, \\
\frac{\zeta(s-1)}{\zeta(s)} &= \sum_{n=1}^{\infty} \phi(n) n^{-s}, \\
\zeta(s) \zeta(s-1) &= \sum_{n=1}^{\infty} \sigma(n) n^{-s}.
\end{aligned} \]

2.17 Evaluate the following infinite products:

\[ \text{(a)} \quad \prod_{p \text{ prime}} \left( 1 - \frac{1}{p^2} \right) = \frac{3}{4} \cdot \frac{8}{9} \cdot \frac{24}{25} \cdot \frac{48}{49} \cdots = \ ? \]

\[ \text{(b)} \quad \prod_{p \text{ prime}} \frac{p^2 + 1}{p^2} = \frac{5}{4} \cdot \frac{10}{9} \cdot \frac{26}{25} \cdot \frac{50}{49} \cdots = \ ? \]

(Compare with the products in Exercise 1.42.)

2.18 Show that the infinite product K := ∏_{p prime} (1 − 1/p²), whose value you computed in
Exercise 2.17, can be given the following geometric interpretation as “the fraction
of lattice points in Z² visible from the origin.” That is, assume that you are standing
at the origin point (0, 0) of an infinite grove of trees, positioned at the lattice points
(m, n) ∈ Z² \ {(0, 0)}. These are idealized trees that have zero thickness, so you will
be able to see the tree at (m, n) from your vantage point if and only if there is no
other tree obscuring the view from some position (m/k, n/k), where k is a common
divisor of m and n, that is, if and only if m and n are relatively prime.

Define

\[ K_N = \frac{\#\{(m,n) \in \mathbb{Z}^2 \setminus \{(0,0)\} : |m|, |n| \le N,\ m, n \text{ are relatively prime}\}}{\#\{(m,n) \in \mathbb{Z}^2 \setminus \{(0,0)\} : |m|, |n| \le N\}} \]

for N ≥ 1. Prove that K_N → K as N → ∞. This gives a precise asymptotic meaning
to the above informal description of K as the fraction of lattice points visible from
the origin.

2.19 Prove the bound (2.41).

2.20 Let p_n denote the nth prime number. Prove that the prime number theorem is
equivalent to the statement that

\[ p_n \sim n \log n \quad \text{as } n \to \infty. \]

2.21 Define a sequence of numbers (β(n))_{n=1}^{∞} by

\[ \beta(n) = \operatorname{lcm}(1, 2, \ldots, n), \]

where for integers a_1, ..., a_k, lcm(a_1, ..., a_k) denotes the least common multiple of
a_1, ..., a_k. This natural number-theoretic sequence of integers [W16] has the numbers
1, 2, 6, 12, 60, 60, 420, 840, 2520, 2520, 27720 as its first few values.

(a) Prove that β(n) = exp(ψ(n)), where ψ(x) = Σ_{p^k ≤ x} log p denotes Chebyshev’s
weighted prime counting function.

(b) Conclude using the equivalent formulation of the prime number theorem in
terms of Chebyshev’s function that

\[ \beta(n) = e^{(1+o(1)) n} \quad \text{as } n \to \infty. \]

3 Conformal mapping 


Second Hypothesis: That small regions of the Earth should be displayed as similar figures in the 
plane. 


Leonhard Euler, “On the mapping of spherical surfaces onto the plane” (1777) 


3.1 Motivation: classifying complex regions up to conformal 
equivalence 


As we discussed in Chapter 1, the notion of a conformal mapping is a highly appealing 
geometric idea that can be explained to anyone without any requirement that they ever 
heard of complex analysis, let alone understand any of the mathematics underlying it. 
Anyone who can appreciate the art of M. C. Escher (see Fig. 1.2 on p. 8) will intuitively 
grasp that there is something special and beautiful about conformal maps. 

Conformal maps are also an important tool in the toolkit of applied mathematicians. 
They have many applications for solving important partial differential equations that 
show up in physics, engineering, and in other areas as diverse as cartography [68] and 
medical imaging [37]. 

In this chapter, we will approach the area of conformal mapping from a purely 
complex-analytic direction. We will see that this side of the theory has a beauty all 
its own, which, while subtle and requiring patience and contemplation to appreciate, 
equals and perhaps surpasses the more obvious aspects appreciated by art lovers and 
equation solvers. 

Let Ω ⊂ C be a complex region. In complex analysis, we often wish to understand
the classes of functions H(Ω) and M(Ω) of holomorphic and meromorphic functions on
Ω, respectively. You might think that the structures of these classes of functions would
depend in some highly sensitive way on the particular choice of the region Ω. As it turns
out, this is largely untrue: although the structure of such a family does vary somewhat,
there are large families of regions Ω for which the structure of H(Ω) (respectively, M(Ω))
is the same across all members of a given family, so that it is in practice enough to understand
what is happening in one representative region of each family. Moreover, the
question of which family a particular region Ω belongs to can in many cases be answered
using topological properties of Ω.

To make this idea precise, we define an equivalence relation on regions that captures
the notion that for two regions Ω and Ω′, H(Ω) and H(Ω′) “have the same structure.”
This relation is called biholomorphism or conformal equivalence. We say that
Ω and Ω′ are conformally equivalent if there is a bijective holomorphic map g : Ω → Ω′
whose inverse is also holomorphic. Such a map g is called a biholomorphism, biholomorphic
map, or conformal map. Note that a conformal map must satisfy g′(z) ≠ 0 for
any z ∈ Ω, by Corollary 1.58. It is trivial to check that the relation of conformal equivalence
is, as its name suggests, an equivalence relation.¹

If Ω and Ω′ are conformally equivalent and related by a conformal map g : Ω → Ω′,
then each holomorphic function (respectively, meromorphic function) f : Ω → C can
be used to define a holomorphic (respectively, meromorphic) function f̃ : Ω′ → C by

\[ \tilde{f} = f \circ g^{-1}. \]

It is immediate to check that the correspondence f ↦ f̃ defines a bijection between
H(Ω) and H(Ω′) (respectively, between M(Ω) and M(Ω′)). Thus the conformal map
allows us to translate any question about holomorphic or meromorphic functions on Ω′
to a question about holomorphic or meromorphic functions on Ω. The definition of
conformal equivalence therefore captures precisely the notion of equivalence we were
interested in.
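As a concrete illustration of the correspondence f ↦ f̃, consider the upper half-plane and the unit disc. The sketch below assumes the standard map g(z) = (z − i)/(z + i), often called the Cayley transform, which is a well-known conformal map from the upper half-plane onto the unit disc; it is used here only as an example and is not introduced in the passage above. The sample function f and the helper names are likewise illustrative.

```python
def cayley(z):
    """g(z) = (z - i)/(z + i), a conformal map from the upper half-plane onto the unit disc."""
    return (z - 1j) / (z + 1j)

def cayley_inverse(w):
    """g^{-1}(w) = i (1 + w)/(1 - w), mapping the unit disc back onto the upper half-plane."""
    return 1j * (1 + w) / (1 - w)

def transfer(f):
    """Given a function f on the upper half-plane, return f~ = f o g^{-1} on the unit disc."""
    return lambda w: f(cayley_inverse(w))

# A function holomorphic on the upper half-plane: its only pole, -2i, lies below the real axis.
f = lambda z: 1 / (z + 2j)
f_tilde = transfer(f)

samples = [0.3 + 0.1j, -2 + 5j, 10 + 0.01j, 1j]
for z in samples:
    w = cayley(z)
    # |w| < 1 confirms that w lies in the disc; the last column should be (numerically) zero.
    print(z, abs(w), abs(f_tilde(w) - f(z)))
```

Any question about f on the half-plane, such as the location of its zeros or its boundary behavior, can be rephrased as a question about f̃ on the disc, and vice versa.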

In many areas of mathematics, when we find an interesting equivalence relation, 
this immediately leads to a standard set of interesting questions: how do we determine 
equivalence? Can we describe all equivalence classes, or at least some particularly sim- 
ple or important ones? Do there exist some canonical representatives in each of those 
equivalence classes? How can we construct a map demonstrating equivalence, and to 
what extent is it unique? And so on. Asking such questions for this particular equiva- 
lence relation turns out to be very fruitful and is what the area of conformal mapping 
is about. 


Examples. Here are some regions that seem worth thinking about from the point of
view of conformal mapping, both theoretically and because they arise in applications
(for example, in the study of Laplace’s equation in mathematical physics, electrostatics,
hydrodynamics, etc.):

1. the complex plane $\mathbb{C}$

2. the punctured plane $\mathbb{C} \setminus \{0\}$

3. the unit disc $\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\}$

4. the upper half-plane $\mathbb{H} = \{z \in \mathbb{C} : \operatorname{Im}(z) > 0\}$

5. the Riemann sphere² $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$

6. the slit plane $\mathbb{C} \setminus (-\infty, 0]$

7. a strip $S(x_1, x_2) = \{z \in \mathbb{C} : x_1 < \operatorname{Re}(z) < x_2\}$

8. a rectangle $\{z \in \mathbb{C} : 0 < \operatorname{Re}(z) < 1,\ a < \operatorname{Im}(z) < b\}$

9. an annulus $A(r_1, r_2) = \{z \in \mathbb{C} : r_1 < |z| < r_2\}$

10. a quadrant $\{z : \operatorname{Re}(z) > 0,\ \operatorname{Im}(z) > 0\}$

11. an ellipse $\{z = x + iy : (x/a)^2 + (y/b)^2 < 1\}$

12. the plane with an interval removed, $\mathbb{C} \setminus [-1, 1]$

13. the upper half-plane with an interval removed, $\mathbb{H} \setminus [0, i]$

14. a “blob” (Fig. 3.1)


¹ In this chapter, we use the term “conformal map” with a slightly different meaning than the sense in
which this term was used in Subsection 1.3.4. That subsection was concerned with understanding the
property of being conformal as a local property; here we develop the conceptually much richer set of
ideas related to understanding maps that are globally conformal, that is, conformal everywhere in the
local sense but also bijective. Moreover, the conformal maps from Subsection 1.3.4 were not assumed to
be orientation preserving. Here we focus on conformal maps that are holomorphic, which in particular
means that they are orientation preserving (see (1.25)).

² The Riemann sphere is not quite a complex region in the usual sense; technically, it is a Riemann
surface, but we will still count it and trust that you understand how the various definitions apply in that
situation; refer to Section 1.11. Actually, the same classification questions we are addressing in the context
of conformal equivalence apply more generally in the theory of Riemann surfaces. We will encounter
an interesting example of the classification of a class of Riemann surfaces up to conformal equivalence
in Chapters 4 and 5; see Sections 4.15, 5.5, and 5.11.


Figure 3.1: Two blob-shaped regions. Are they conformally equivalent?


Can you guess what is the correct grouping of these regions according to conformal
equivalence? (Note: in example 9 of the annulus, we in fact have a family of regions,
which may not all be conformally equivalent to each other.) By the end of this chapter,
you will know the answers.


Since conformal maps are continuous, the relation of conformal equivalence is a
stronger notion of equivalence than topological equivalence (a.k.a. homeomorphism).
We record this obvious but important fact as a lemma.


Lemma 3.1. If regions $\Omega$ and $\Omega'$ are conformally equivalent, then they are homeomorphic.


Next, if regions $\Omega$ and $\Omega'$ are conformally equivalent, with the conformal map
$g : \Omega \to \Omega'$ relating them, then is $g$ unique? If not, can the extent to which it is not unique
be made precise? The answer to these questions is described in terms of the automorphism
group of a complex region. More precisely, if $\tilde{g} : \Omega \to \Omega'$ is another conformal
map, then the map $h : \Omega \to \Omega$ defined by

\[ h = g^{-1} \circ \tilde{g} \]

is a conformal equivalence map between $\Omega$ and itself. We call such a map a (conformal)
automorphism of $\Omega$. Conversely, if $g : \Omega \to \Omega'$ is a conformal map and $h : \Omega \to \Omega$ is a
conformal automorphism, then $\tilde{g} : \Omega \to \Omega'$ defined by

\[ \tilde{g} = g \circ h \]

is also a conformal map from $\Omega$ to $\Omega'$, and clearly every conformal map $\tilde{g} : \Omega \to \Omega'$
can be represented in such a way for some automorphism $h : \Omega \to \Omega$ (just define $h$
as above). Thus the family of automorphisms of $\Omega$ precisely measures the extent of the
nonuniqueness of the conformal map $g : \Omega \to \Omega'$ for any $\Omega'$ that is conformally
equivalent to $\Omega$. This family has the algebraic structure of a group, with the group operation
being composition of maps, and is thus referred to as the automorphism group of $\Omega$. We
denote this group by $\operatorname{Aut}(\Omega)$. We will seek to give explicit descriptions of automorphism
groups whenever this is possible.

To conclude this general discussion, we note one additional useful fact about con- 
formal maps. 


Lemma 3.2. In the definition of conformal equivalence, the condition that $g^{-1}$ is holomorphic
can be dropped, that is, if $g : \Omega \to \Omega'$ is holomorphic and bijective, then $g^{-1}$ is
automatically holomorphic.


Proof. Since $g$ is injective, it satisfies $g'(z_0) \neq 0$ for any $z_0 \in \Omega$, so the inverse function
theorem (Theorem 1.56) implies that the inverse map $g^{-1}$ exists locally in a neighborhood of $g(z_0)$
as a holomorphic function for any $z_0 \in \Omega$. Since $g$ is a bijection, the inverse function exists
globally (in the sense of set theory) as a function $g^{-1} : \Omega' \to \Omega$. The fact that $g^{-1}$ is locally
holomorphic implies that the global inverse function $g^{-1}$ is holomorphic, which is the
claim of the lemma.


In the next few sections, we begin to classify some of the main conformal equiva- 
lence classes that every complex analyst should be familiar with. The most important 
classification result in this chapter is the Riemann mapping theorem, which is formu- 
lated in Section 3.4. 


Suggested exercises for Section 3.1. 3.1. 


3.2 First singleton conformal equivalence class: the complex plane 


The first conformal equivalence class we discuss contains just a single element, the com- 
plex plane. This is explained by the following theorem. 


Theorem 3.3. Let $g : \mathbb{C} \to \Omega$ be a conformal map between $\mathbb{C}$ and a region $\Omega$. Then $\Omega = \mathbb{C}$,
$g(z)$ is a conformal automorphism, and $g(z)$ has the form

\[ g(z) = az + b \]

for some complex numbers $a, b$ with $a \neq 0$.


Proof. Let $g : \mathbb{C} \to \Omega$ be a conformal equivalence map. We will prove that $g(z)$ is of the
form $g(z) = az + b$ with $a \neq 0$ just based on the assumption that it is an entire function
and that it is injective; the additional claims that $\Omega = \mathbb{C}$ and $g(z)$ is an automorphism
will then follow.

Since g(z) is an entire function, it is either a polynomial, or it is not. We treat each 
of those two cases separately (proving that g(z) is of the desired form in the first case 
and proving that the second case cannot occur). 

If $g(z)$ is a polynomial, it cannot be a constant since those certainly are not injective
maps. We claim that it also cannot be a polynomial of degree $k \ge 2$, which if true would
leave only the option of a linear function $g(z) = az + b$ with $a \neq 0$. The fact that
polynomials of degree higher than 1 are not injective is easy to see: a polynomial of degree
$k$ has $k$ roots counting with multiplicity, which means that either there are at least two
distinct zeros (contradicting the assumption of injectivity), or there is a single zero of
multiplicity $k$, which means that the polynomial is of the form $g(z) = c(z - a)^k$. This
polynomial is clearly also not injective since in that case the equation $g(z) = 1$ has $k$
distinct solutions.
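
As a small numerical illustration of this last step, here is a minimal sketch in Python. The function name `solutions` and the sample values of $c$, $a$, and $k$ are assumptions made only for this example; the snippet lists the $k$ solutions of $c(z - a)^k = 1$ so that we can confirm they are pairwise distinct.

```python
import cmath


def solutions(c, a, k):
    """Return the k solutions z of c * (z - a)**k = 1, for c != 0."""
    root = (1 / complex(c)) ** (1.0 / k)  # one particular k-th root of 1/c
    # Multiplying by the k-th roots of unity gives all k solutions.
    return [a + root * cmath.exp(2j * cmath.pi * j / k) for j in range(k)]


# Hypothetical sample values: c = 2 - i, a = 1 + i, k = 3.
zs = solutions(2 - 1j, 1 + 1j, 3)
residuals = [abs((2 - 1j) * (z - (1 + 1j)) ** 3 - 1) for z in zs]
min_gap = min(abs(p - q) for i, p in enumerate(zs) for q in zs[i + 1:])
```

Here `residuals` measures how well each point solves the equation, and `min_gap` is the smallest distance between two of the solutions; a strictly positive gap is exactly the failure of injectivity used in the proof.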

It remains to consider the other possibility of an entire function that is not a polynomial.
In that scenario, we claim that $g(z)$ has an essential singularity at $z = \infty$. For
otherwise, by our classification of singularities (Section 1.12), $g(z)$ must have a removable
singularity or a pole of some order $k$ at infinity (we take $k = 0$ in the removable case).
However, this implies that the rate of growth of $|g(z)|$ is restricted by the order of the pole;
specifically, $g(z)$ satisfies a bound of the form $|g(z)| \le A + B|z|^k$ for all $z$, where $A$ and $B$
are positive real constants. Now a well-known argument from basic complex analysis
(Exercise 1.25) implies that $g(z)$ is actually a polynomial of degree at most $k$, which is a
contradiction.

We are now in a good position to apply the Casorati-Weierstrass theorem (Theorem 1.46)
about the behavior of functions near an essential singularity. Denote $w_0 = g(0)$. Since $g(z)$
is an open mapping by the open mapping theorem (Theorem 1.50), the image $g(\mathbb{D})$ of the
unit disc under $g(z)$ contains an open neighborhood $E$ of $w_0$. But by the Casorati-Weierstrass
theorem the image $g(\mathbb{C} \setminus \overline{D_R(0)})$ of the complement of any closed disc
$\overline{D_R(0)}$ around 0 (i.e., any neighborhood of $\infty$) is dense in $\mathbb{C}$ and therefore has a
nonempty intersection with $E$. This intersection means that there exist points $z_1 \in \mathbb{D}$
and $z_2 \in \mathbb{C} \setminus \overline{D_R(0)}$ for which

\[ g(z_1) = g(z_2). \]

Now if $R > 1$, then $z_1 \neq z_2$. We have therefore shown that $g(z)$ is not injective, which
contradicts our initial assumption. Thus the scenario of a conformal map on $\mathbb{C}$ that is
not a polynomial is impossible, and the proof is complete.


By Theorem 3.3 the group of conformal automorphisms of $\mathbb{C}$ is

\[ \operatorname{Aut}(\mathbb{C}) = \{ z \mapsto az + b : a, b \in \mathbb{C},\ a \neq 0 \}. \]


3.3 Second singleton conformal equivalence class: the Riemann sphere


There is a second conformal equivalence class that is a singleton, the Riemann sphere.
The following result is the analogue of Theorem 3.3 for $\hat{\mathbb{C}}$.


Theorem 3.4. If $g : \hat{\mathbb{C}} \to \Omega$ is a conformal map between $\hat{\mathbb{C}}$ and a region $\Omega$, then $\Omega = \hat{\mathbb{C}}$,
$g(z)$ is a conformal automorphism, and $g(z)$ has the form

\[ g(z) = \frac{az + b}{cz + d} \tag{3.1} \]

for some complex numbers $a, b, c, d$ with $ad - bc \neq 0$.


Proof of Theorem 3.4. We start by proving that $\Omega = \hat{\mathbb{C}}$. Assume that this is not the case,
i.e., that there is at least one point $w \in \hat{\mathbb{C}}$ that is not in the image $g(\hat{\mathbb{C}})$. We can assume
without loss of generality that $w = \infty$; otherwise, replace the map $g(z)$ with
$\tilde{g}(z) = \frac{1}{g(z) - w}$. Once $\tilde{g}(z)$ is shown to be of the desired form (3.1), solving the
equation $\tilde{g}(z) = \frac{1}{g(z) - w}$ for $g(z)$ shows that $g(z)$ is of that form as well.

Since $g(z)$ does not take the value $\infty$, it also cannot approach infinity, that is, there
does not exist a sequence $(z_n)_{n=1}^{\infty}$ of points in $\hat{\mathbb{C}}$ for which $g(z_n) \to \infty$. If such a sequence
existed, we could use the fact that $\hat{\mathbb{C}}$ is compact to extract a convergent subsequence
$z_{n_k} \to z \in \hat{\mathbb{C}}$, whence it would follow, since $g(z)$ is a continuous function, that $g(z) = \infty$,
which cannot happen since $\infty$ is not in the image of $g(z)$.

The fact that $g(z)$ does not approach $\infty$ means simply that $g(z)$ is a bounded function
and a holomorphic one at that (our a priori assumption that allows $\Omega$ to contain the
point $\infty$ only means it is meromorphic). Thus its restriction to $\mathbb{C}$ is a bounded entire
function and hence constant by Liouville’s theorem, a contradiction.

Having established that $\Omega = \hat{\mathbb{C}}$, we now know that $g(z)$ is a genuine automorphism
of $\hat{\mathbb{C}}$. Denote $w = g(\infty)$. Once again, we can assume without loss of generality that
$w = \infty$; otherwise, replace the map $g(z)$ with $\tilde{g}(z) = \frac{1}{g(z) - w}$ as before. Under this
assumption, the restriction of $g(z)$ to $\mathbb{C}$ is a conformal automorphism of $\mathbb{C}$, so from the
discussion in the previous section we know that $g(z)$ is of the form $az + b$ for some
$a, b \in \mathbb{C}$, $a \neq 0$.

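By Theorem 3.4 the group of conformal automorphisms of $\hat{\mathbb{C}}$ is

\[ \operatorname{Aut}(\hat{\mathbb{C}}) = \left\{ z \mapsto \frac{az + b}{cz + d} : a, b, c, d \in \mathbb{C},\ ad - bc \neq 0 \right\}. \tag{3.2} \]

The elements of this group are known as Möbius transformations. An important and
easy-to-check property of such transformations is that they act as $2 \times 2$ linear
transformations; more precisely, given two Möbius transformations

\[ T_1(z) = \frac{a_1 z + b_1}{c_1 z + d_1} \quad \text{and} \quad T_2(z) = \frac{a_2 z + b_2}{c_2 z + d_2}, \tag{3.3} \]

their composition is given by

\[ (T_1 \circ T_2)(z) = \frac{\alpha z + \beta}{\gamma z + \delta}, \tag{3.4} \]

where $\alpha, \beta, \gamma, \delta$ are the entries of the matrix

\[ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} a_1 & b_1 \\ c_1 & d_1 \end{pmatrix} \begin{pmatrix} a_2 & b_2 \\ c_2 & d_2 \end{pmatrix}. \tag{3.5} \]

For this reason, Möbius transformations are also known as fractional linear
transformations.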
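
The correspondence between composition and matrix multiplication in (3.3)-(3.5) is easy to test numerically. The following is a minimal sketch in Python; the helper names `mobius` and `matmul2`, the two sample matrices, and the sample points are all assumptions made for illustration.

```python
def mobius(m):
    """Return the map z -> (a z + b) / (c z + d) for the matrix m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return lambda z: (a * z + b) / (c * z + d)


def matmul2(m1, m2):
    """Product of two 2 x 2 matrices given as nested tuples."""
    (a1, b1), (c1, d1) = m1
    (a2, b2), (c2, d2) = m2
    return ((a1 * a2 + b1 * c2, a1 * b2 + b1 * d2),
            (c1 * a2 + d1 * c2, c1 * b2 + d1 * d2))


# Two hypothetical coefficient matrices with nonzero determinant.
M1 = ((1 + 2j, 3), (1j, 2 - 1j))
M2 = ((2, -1j), (1, 1 + 1j))
T1, T2, T12 = mobius(M1), mobius(M2), mobius(matmul2(M1, M2))


def max_composition_error(points):
    """Largest gap between T1(T2(z)) and the map built from the matrix product."""
    return max(abs(T1(T2(z)) - T12(z)) for z in points)


sample = [0.3 + 0.2j, -1.5 + 0.7j, 2 - 3j, 0.1j]  # chosen away from the poles
error = max_composition_error(sample)
```

The value `error` should be at the level of floating-point rounding. The sample points are chosen to avoid the poles of the maps involved, since the sketch does not treat the point $\infty$.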

The group (3.2) is also sometimes referred to as the projective linear group (of
order 2 over the complex numbers) and denoted $\mathrm{PSL}(2, \mathbb{C})$. The reason for this
terminology is as follows. If we define the special linear group (of order 2 over the complex
numbers) by

\[ \mathrm{SL}(2, \mathbb{C}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{C},\ ad - bc = 1 \right\}, \]

then we can easily check that the association mapping a matrix
$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2, \mathbb{C})$ to the Möbius transformation $z \mapsto \frac{az + b}{cz + d}$ is a surjective
group homomorphism, which has the subgroup $\left\{ \pm \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right\}$ as its kernel. Thus, by the first
isomorphism theorem in group theory, the group $\operatorname{Aut}(\hat{\mathbb{C}})$ can be identified with the
quotient group

\[ \mathrm{SL}(2, \mathbb{C}) \Big/ \left\{ \pm \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \right\}. \]

The quotienting operation in this context is often referred to as projectivization, which
explains the name projective linear group for the quotient group, and the occasional
use of the same name and notation for the group of Möbius transformations.

The group $\mathrm{PSL}(2, \mathbb{C})$ is an important group in mathematics and even has interesting
connections to physics; see the box below.


Suggested exercises for Section 3.3. 3.2. 


3.4 The Riemann mapping theorem 


We have seen two conformal equivalence classes consisting of a single element each. 
Obviously, if all other equivalence classes were also singletons, the situation would be 
extremely boring, and the notion of conformal equivalence would not even deserve its 
own name. It is easy to see however that the true situation is, at least, more complicated 
than this simplistic scenario (see Exercise 3.3).


The group $\mathrm{PSL}(2, \mathbb{C})$ and the night sky of a relativistically moving observer


Suppose you get into a spaceship and speed away from Earth, reaching a velocity of $ac$, where $c$ is the
speed of light, and the fraction $a$ is substantial (say, higher than 5%). We know from science fiction
movies that your view of the stars as you peer through the spaceship window will appear distorted. But
how, exactly? This problem has a delightful connection to complex analysis and the automorphism group
$\mathrm{PSL}(2, \mathbb{C})$ of the Riemann sphere. In fact, your view of the celestial sphere of stars gets transformed by a
Möbius transformation acting on the celestial sphere precisely as if it were the Riemann sphere.

Mathematically, the connection is roughly as follows: it is well known from the theory of special
relativity that an observer moving at relativistic velocity $v$ relative to the Earth (which for the sake of
discussion we assume is an inertial frame of reference) will have their time and space coordinates
transformed from the Earth’s time and space coordinate system according to a type of linear transformation
known as a proper, orthochronous Lorentz transformation. The group of such transformations can be
represented as the group of $4 \times 4$ real matrices

\[ L_+^{\uparrow} = \left\{ T \in \mathrm{Mat}_{4 \times 4}(\mathbb{R}) : \det(T) = 1,\ T_{1,1} > 0,\ T^{\top} X T = X \right\}, \]

where $X$ is the $4 \times 4$ diagonal matrix with diagonal entries $-1, 1, 1, 1$. In fact, it can be shown that
$L_+^{\uparrow}$ is isomorphic to $\mathrm{PSL}(2, \mathbb{C})$ and that the isomorphism $\rho : L_+^{\uparrow} \to \mathrm{PSL}(2, \mathbb{C})$ is such that for the
moving observer with a given associated Lorentz transformation $T$, the distortion of the moving observer’s
celestial sphere relative to the celestial sphere of the static frame of reference is described precisely by the
Möbius transformation $\rho(T)$, under the obvious identification between the celestial sphere and the Riemann
sphere. See [53, Appendix B] and [55, Ch. 1] for the details of this surprising result.


On this optimistic note, it looks like there ought to be some interesting phenomena 
for us to explore. This brings us to one of the most fundamental results on conformal 
mapping, the Riemann mapping theorem, which identifies the first nontrivial confor- 
mal equivalence class and the one that undoubtedly plays the most central role in com- 
plex analysis. 


Theorem 3.5 (Riemann mapping theorem: simple version). Let $\Omega, \Omega' \subset \mathbb{C}$ be simply
connected complex regions with $\Omega, \Omega' \neq \mathbb{C}$. Then $\Omega$ and $\Omega'$ are conformally equivalent.


As an immediate corollary, we get an interesting result in topology, an illustration of 
the principle that the often symbiotic relationship between complex analysis and topol- 
ogy involves a flow of ideas in both directions. 


Corollary 3.6. Any two simply connected regions in the plane are homeomorphic. 


This well-known result can also be proved without the use of complex analysis. 
See [W17] for a related discussion. 

To prove Theorem 3.5, we will need to develop some new theoretical ideas (which 
are also interesting in their own right and are of broader applicability). A more precise 
version of the theorem is stated in Section 3.7. 

Tangentially to that effort, we also wish to understand the structure of the automorphism
groups $\operatorname{Aut}(\Omega)$ for regions $\Omega$ belonging to the conformal equivalence class
described by the theorem. By Exercise 3.1 all such groups are isomorphic in such a way
that the isomorphism between any two can be described in terms of conformal equivalence
maps $g : \Omega \to \Omega'$ relating different class members. Thus, to understand the
automorphism groups, it is in fact sufficient to classify the automorphisms for just one
representative member of the class. There are two fairly canonical choices for such a
member, the unit disc $\mathbb{D}$ and the upper half-plane $\mathbb{H}$ (and those two are easy to relate to
each other, though doing so is still interesting). We discuss these regions in the next two
sections.


Suggested exercises for Section 3.4. 3.3. 


3.5 The unit disc and its automorphisms 


The next result, known as the Schwarz lemma, is a simple yet powerful result about 
holomorphic functions from the unit disc to itself that keep the origin fixed. It is an 
important tool on the path to characterizing the automorphisms of the unit disc. 

If $g : \mathbb{D} \to \mathbb{D}$, then we say that $g(z)$ is a rotation map, or simply a rotation, if it is
of the form $g(z) = e^{i\theta} z$ for some $\theta \in [0, 2\pi)$.


Lemma 3.7 (The Schwarz lemma). Let $g : \mathbb{D} \to \mathbb{D}$ be a holomorphic function that satisfies
$g(0) = 0$. Then:

1. $|g(z)| \le |z|$ for all $z \in \mathbb{D}$.

2. If $|g(z)| = |z|$ for some $z \neq 0$, then $g(z)$ is a rotation.

3. $|g'(0)| \le 1$.

4. If $|g'(0)| = 1$, then $g(z)$ is a rotation.


Proof. Since $g(z)$ has a zero at $z = 0$, we know that it satisfies $|g(z)| \le C|z|$ for some
$C > 0$ and all $z$ in some neighborhood of 0. This is a weaker inequality than the one we
are trying to prove, but in fact it is a helpful observation, as it can be restated as the
claim that $h(z) = g(z)/z$ satisfies $|h(z)| \le C$ for all $z \neq 0$ in that neighborhood; that is,
$h(z)$ is bounded in a punctured neighborhood of 0 and of course holomorphic there. By
Riemann’s removable singularity theorem (Theorem 1.38), $h(z)$ therefore has a removable
singularity at 0 and can be extended to a holomorphic function on all of $\mathbb{D}$ (which we still
denote $h(z)$, as per the usual convention when talking about analytic continuation). Now let
$z \in \mathbb{D} \setminus \{0\}$, and let $r$ be a real number with $|z| < r < 1$. By the maximum modulus principle
(Theorem 1.51) the maximum modulus of $h(z)$ in the closed disc of radius $r$ around 0 is
attained at the boundary of that disc. Therefore we have that

\[ \left| \frac{g(z)}{z} \right| = |h(z)| \le \max_{|w| \le r} |h(w)| \le \max_{0 \le t \le 2\pi} |h(re^{it})| = \max_{0 \le t \le 2\pi} \frac{|g(re^{it})|}{r} \le \frac{1}{r}. \]

(In the last step, we used the fact that $g(z)$ maps $\mathbb{D}$ into itself, so $|g(w)| < 1$ for all $w \in \mathbb{D}$.)
Since this is true for all $r$ with $|z| < r < 1$, we then have that

\[ \left| \frac{g(z)}{z} \right| \le \inf_{|z| < r < 1} \frac{1}{r} = 1, \]

that is, $|g(z)| \le |z|$, which was the first claim of the lemma. Now claim 3 also follows
by taking an additional limit of these inequalities as $z \to 0$, since
$|g'(0)| = \left| \lim_{z \to 0} \frac{g(z)}{z} \right| = \lim_{z \to 0} \left| \frac{g(z)}{z} \right| \le 1$.

Now, for claim 2, note that an equality for some $z \in \mathbb{D}$ in the bound $|h(z)| \le 1$
means that $|h(z)|$ attains its maximal value in the interior of the disc. By the condition
for equality in the maximum modulus principle, $h(z)$ must be a constant, which is of
unit magnitude (since we know that $|h(z)| = 1$ for some $z$). That is, we have shown that
$h(z) = e^{i\theta}$ for some $\theta$ or, equivalently, that $g(z)$ is a rotation, giving claim 2.

Similarly, for the fourth claim, if $1 = |g'(0)| = \lim_{z \to 0} \left| \frac{g(z)}{z} \right| = \lim_{z \to 0} |h(z)| = |h(0)|$,
then again we see that $|h(z)|$ attains its maximum value in the interior of the disc (in this
case at $z = 0$) and infer using the same argument as above that $g(z)$ is a rotation.
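
To see the inequalities of the Schwarz lemma in action, here is a minimal numerical sketch in Python. The test function $g(z) = z \cdot \frac{z - A}{1 - Az}$ with $A = 0.5$ is a hypothetical example chosen for illustration (it maps $\mathbb{D}$ into $\mathbb{D}$ and fixes 0 but is not a rotation), and the helper names are likewise assumptions.

```python
import cmath

A = 0.5  # hypothetical real parameter with |A| < 1


def g(z):
    """A holomorphic self-map of the unit disc with g(0) = 0 that is not a rotation."""
    return z * (z - A) / (1 - A * z)


def sample_disc(n_radii=20, n_angles=40):
    """Sample points of the punctured unit disc on a polar grid."""
    pts = []
    for i in range(1, n_radii + 1):
        r = i / (n_radii + 1)  # radii strictly between 0 and 1
        for k in range(n_angles):
            pts.append(r * cmath.exp(2j * cmath.pi * k / n_angles))
    return pts


def worst_ratio(points):
    """Largest value of |g(z)| / |z| over the sample; claim 1 says it is at most 1."""
    return max(abs(g(z)) / abs(z) for z in points)


def derivative_at_zero(h=1e-6):
    """Central-difference approximation of g'(0); claim 3 says its modulus is at most 1."""
    return (g(h) - g(-h)) / (2 * h)


ratio = worst_ratio(sample_disc())
slope = derivative_at_zero()
```

For this choice of $g$ the exact derivative is $g'(0) = -A$, so both quantities stay strictly below 1, in line with claims 2 and 4: equality would force $g$ to be a rotation.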


Corollary 3.8 (Automorphisms of the unit disc that fix 0). The automorphisms $g : \mathbb{D} \to \mathbb{D}$
of the unit disc that fix 0 (that is, satisfy $g(0) = 0$) are precisely the rotations.


Proof. Obviously, a rotation is a conformal automorphism of $\mathbb{D}$ that fixes 0. Conversely,
let $g : \mathbb{D} \to \mathbb{D}$ be an automorphism that fixes 0. Then both $g(z)$ and its inverse function
$g^{-1}(z)$ satisfy the assumptions of the Schwarz lemma. It follows that $|g(z)| \le |z|$ and
$|g^{-1}(w)| \le |w|$ for all $z, w \in \mathbb{D}$; or, setting $w = g(z)$ for an arbitrary $z \in \mathbb{D}$ in the second
inequality,

\[ |g(z)| \le |z| \ \text{ and } \ |z| \le |g(z)| \quad \Longrightarrow \quad |g(z)| = |z| \]

for all $z \in \mathbb{D}$. By part 2 of the Schwarz lemma, $g(z)$ is a rotation.


We can now exhibit a more general two-parameter family of automorphisms of $\mathbb{D}$,
which are obtained by composing rotations with an additional family of automorphisms
that do not fix 0. As a first step, for $w \in \mathbb{D}$, we define the Möbius transformation

\[ \varphi_w(z) = \frac{w - z}{1 - \bar{w} z}. \tag{3.6} \]

Lemma 3.9. The transformation $\varphi_w$ is an automorphism of $\mathbb{D}$. Moreover, it has the
following properties: (a) $\varphi_w(0) = w$; (b) $\varphi_w(w) = 0$; (c) $\varphi_w^{-1} = \varphi_w$.


Proof. Properties (a)-(c) are trivial to check through a direct calculation, which I leave
as an exercise. For the claim that $\varphi_w$ is an automorphism, note that if $|z| = 1$, then

\[ |\varphi_w(z)| = \frac{|w - z|}{|1 - \bar{w} z|} = \frac{|w - z|}{|1 - \bar{w} z| \cdot |\bar{z}|} = \frac{|w - z|}{|\bar{z} - \bar{w} z \bar{z}|} = \frac{|w - z|}{|\bar{z} - \bar{w}|} = 1. \]

Thus $\varphi_w$ maps the unit circle into itself. It is also injective (as a meromorphic function
on $\hat{\mathbb{C}}$) since it is a Möbius transformation. Therefore either $\varphi_w$ maps the unit disc $\mathbb{D}$ into
itself and maps the complement $\overline{\mathbb{D}}^c = \{|z| > 1\}$ of the closed unit disc into itself, or $\varphi_w$
maps $\mathbb{D}$ into $\overline{\mathbb{D}}^c$ and maps $\overline{\mathbb{D}}^c$ into $\mathbb{D}$. However, we know that $\varphi_w(0) = w$ and $w \in \mathbb{D}$, so
that rules out the latter possibility. Finally, since we have established that $\varphi_w(\mathbb{D}) \subset \mathbb{D}$,
and we know that $\varphi_w^{-1} = \varphi_w$, the mapping of $\mathbb{D}$ into itself by $\varphi_w$ is bijective, and $\varphi_w$ is a
conformal equivalence.
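
The properties in Lemma 3.9 can be spot-checked numerically. The sketch below is in Python; the factory name `phi`, the sample point $w$, and the test points are assumptions made for this example only.

```python
import cmath


def phi(w):
    """Return the map z -> (w - z) / (1 - conj(w) * z) for a point w with |w| < 1."""
    w = complex(w)
    return lambda z: (w - z) / (1 - w.conjugate() * z)


w = 0.3 - 0.4j  # hypothetical sample point in the unit disc
p = phi(w)

circle = [cmath.exp(2j * cmath.pi * k / 12) for k in range(12)]
inside = [0.0, 0.5, -0.2 + 0.7j, 0.9j, -0.6 - 0.6j]

value_at_zero = p(0)                                       # property (a): equals w
value_at_w = p(w)                                          # property (b): equals 0
involution_error = max(abs(p(p(z)) - z) for z in inside)   # property (c)
circle_error = max(abs(abs(p(z)) - 1) for z in circle)     # unit circle to unit circle
max_modulus_inside = max(abs(p(z)) for z in inside)        # stays below 1
```

Property (c) is checked in the equivalent form $\varphi_w(\varphi_w(z)) = z$, which is what the identity $\varphi_w^{-1} = \varphi_w$ says pointwise.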


The composition of an arbitrary member of the family of rotations (specified by a
real-valued parameter $\theta \in [0, 2\pi)$) and an arbitrary member of the family $\varphi_w$, specified
by the point $w \in \mathbb{D}$, is a map of the form

\[ z \mapsto e^{i\theta} \frac{w - z}{1 - \bar{w} z}. \]

It turns out that all automorphisms of the unit disc are of this form. This is the
well-known characterization of the automorphism group $\operatorname{Aut}(\mathbb{D})$, given in the following
theorem.


Theorem 3.10 (Automorphisms of the unit disc). A function $g : \mathbb{D} \to \mathbb{D}$ is an automorphism
of $\mathbb{D}$ if and only if it is of the form

\[ g(z) = e^{i\theta} \frac{w - z}{1 - \bar{w} z} \tag{3.7} \]

for some $\theta \in [0, 2\pi)$ and $w \in \mathbb{D}$. The pair $(\theta, w)$ in this representation is unique.


Proof. The “if” part was already explained above. To prove the “only if” claim, let
$g : \mathbb{D} \to \mathbb{D}$ be an automorphism. Denote $w = g^{-1}(0) \in \mathbb{D}$, and let $h = g \circ \varphi_w$. As the
composition of two automorphisms of $\mathbb{D}$, $h(z)$ is itself an automorphism of $\mathbb{D}$. It also
leaves $z = 0$ fixed. By Corollary 3.8 it is a rotation and can be expressed as $h(z) = e^{i\theta} z$ for
some $\theta \in [0, 2\pi)$. Therefore $g(z) = (h \circ \varphi_w)(z)$ is of the desired form (3.7).

For the uniqueness claim, note that (3.7) implies that $w = g^{-1}(0)$, which determines
$w$ uniquely for a given automorphism $g$. Now if $w \neq 0$, then we have $g(0) = e^{i\theta} w$, which
can be written as $e^{i\theta} = g(0)/w$, and thus $\theta$ is also determined uniquely from the map $g$.
In the second case where $w = 0$, we are back to the scenario of an automorphism that
fixes 0, which we have seen must be a rotation $g(z) = e^{i\theta} z$, with $\theta$ again clearly being
uniquely determined.


An alternative, but less frequently used, characterization of the automorphisms of 
the unit disc is given in the next result. The proof is left as an exercise (Exercise 3.4). 


Theorem 3.11 (Automorphisms of the unit disc: alternative representation). A function
$g : \mathbb{D} \to \mathbb{D}$ is an automorphism of $\mathbb{D}$ if and only if it is of the form

\[ g(z) = \frac{uz + v}{\bar{v} z + \bar{u}} \tag{3.8} \]

for some $u, v \in \mathbb{C}$ satisfying $|u|^2 - |v|^2 = 1$. The pair $(u, v)$ is unique up to replacing it
by $(-u, -v)$.

The explicit description of the automorphisms of $\mathbb{D}$ in terms of the representations
(3.7)-(3.8), involving formulas that one rarely encounters outside of complex analysis,
masks the fact that the group of such automorphisms bears a close relationship
with a standard matrix group you may be familiar with from linear algebra, the theory
of Lie groups, topology, and other areas. As we will see in the next section, the connection
becomes apparent when we switch from the unit disc to its “conformal sibling,” the
upper half-plane.
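
The two descriptions of $\operatorname{Aut}(\mathbb{D})$ in Theorems 3.10 and 3.11 can be compared numerically. The following Python sketch assumes that (3.8) reads $g(z) = \frac{uz + v}{\bar{v} z + \bar{u}}$; the conversion formulas $u = -\frac{i e^{i\theta/2}}{\sqrt{1 - |w|^2}}$, $v = -uw$ and all function names are assumptions introduced here for illustration, not the book's own solution of Exercise 3.4.

```python
import cmath
import math


def disc_auto_theta_w(theta, w):
    """Automorphism of the disc in the form (3.7)."""
    return lambda z: cmath.exp(1j * theta) * (w - z) / (1 - w.conjugate() * z)


def disc_auto_uv(u, v):
    """Automorphism of the disc in the assumed form (3.8)."""
    return lambda z: (u * z + v) / (v.conjugate() * z + u.conjugate())


def theta_w_to_uv(theta, w):
    """One choice of (u, v) with |u|^2 - |v|^2 = 1 representing the same map."""
    rho = 1 / math.sqrt(1 - abs(w) ** 2)
    u = -1j * rho * cmath.exp(1j * theta / 2)
    return u, -u * w


def uv_to_theta_w(u, v):
    """Recover (theta, w) with theta in [0, 2*pi) from a pair (u, v)."""
    theta = cmath.phase(-u / u.conjugate()) % (2 * math.pi)
    return theta, -v / u


theta, w = 1.2, 0.3 - 0.4j                 # hypothetical sample parameters
u, v = theta_w_to_uv(theta, w)
points = [0.0, 0.5, -0.2 + 0.7j, 0.9j]
gap = max(abs(disc_auto_theta_w(theta, w)(z) - disc_auto_uv(u, v)(z)) for z in points)
sign_gap = max(abs(disc_auto_uv(u, v)(z) - disc_auto_uv(-u, -v)(z)) for z in points)
theta_back, w_back = uv_to_theta_w(u, v)
```

Note that `sign_gap` vanishes: the pairs $(u, v)$ and $(-u, -v)$ define the same map, so a representation of the form (3.8) can only be determined up to this sign.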


Suggested exercises for Section 3.5. 3.4. 


3.6 The upper half-plane and its automorphisms 


Lemma 3.12. The unit disc $\mathbb{D}$ and the upper half-plane $\mathbb{H}$ are conformally equivalent. The
pair of maps $\Phi : \mathbb{H} \to \mathbb{D}$ and $\Psi : \mathbb{D} \to \mathbb{H}$ given by

\[ \Phi(z) = \frac{z - i}{z + i}, \qquad \Psi(z) = -i\, \frac{z + 1}{z - 1} \tag{3.9} \]

give an explicit pair of mutually inverse conformal maps mapping each of the regions onto
the other.


Proof. Note that if $z = x + iy$, then $|\Phi(z)|^2 = \frac{|z - i|^2}{|z + i|^2} = \frac{x^2 + (y - 1)^2}{x^2 + (y + 1)^2}$, which is $< 1$ if and only if
$\operatorname{Im}(z) = y > 0$ (the geometric meaning of this statement is simply that $|\Phi(z)|$ is the ratio
of the distances of $z$ to $i$ and $-i$, and the upper half-plane is precisely the locus of points
that are closer to $i$ than to $-i$). Thus $\Phi$ maps $\mathbb{H}$ into $\mathbb{D}$ and the complement of $\mathbb{H}$ into the
complement of $\mathbb{D}$. Since we know that $\Phi$ is a conformal map when regarded as a map
from $\hat{\mathbb{C}}$ to itself, this is enough to imply that it maps $\mathbb{H}$ surjectively and conformally onto
$\mathbb{D}$. Finally, it is trivial to verify by direct calculation that the inverse map to $\Phi(z)$ is given
by the formula defining $\Psi(z)$.
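
A quick numerical sanity check of Lemma 3.12 can be sketched as follows in Python. The explicit formulas used here, $\Phi(z) = \frac{z - i}{z + i}$ (consistent with the description of $|\Phi(z)|$ as a ratio of distances in the proof) and its inverse $\Psi(w) = -i\,\frac{w + 1}{w - 1}$, are stated as assumptions, as are the sample points.

```python
def Phi(z):
    """The map (z - i) / (z + i), assumed form of the first map in (3.9)."""
    return (z - 1j) / (z + 1j)


def Psi(w):
    """The inverse map -i (w + 1) / (w - 1), sending the disc back to the half-plane."""
    return -1j * (w + 1) / (w - 1)


upper = [1j, 2 + 0.5j, -3 + 0.01j, 0.2 + 4j]   # sample points with Im(z) > 0
lower = [-2j, 1 - 0.5j, -3 - 0.01j]            # sample points with Im(z) < 0

max_modulus_upper = max(abs(Phi(z)) for z in upper)     # below 1: lands in the disc
min_modulus_lower = min(abs(Phi(z)) for z in lower)     # above 1: lands outside
round_trip_error = max(abs(Psi(Phi(z)) - z) for z in upper)
```

The point $i$ is sent to the center of the disc, and points close to the real axis are sent close to the unit circle, which is why the third sample point in each list is the most delicate one.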


Theorem 3.13 (Conformal automorphisms of the upper half-plane). A function $g : \mathbb{H} \to \mathbb{H}$
is a conformal automorphism if and only if it is of the form

\[ g(z) = \frac{az + b}{cz + d} \tag{3.10} \]

for real numbers $a, b, c, d$ satisfying $ad - bc = 1$. The numbers $a, b, c, d$ in this representation
are unique up to a single choice of sign, in the sense that if $a, b, c, d$ and $a', b', c', d'$
are coefficients in two distinct representations, then $(a', b', c', d') = \pm(a, b, c, d)$.

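Proof. “If”: assume that $g(z)$ has the stated form (3.10) with $a, b, c, d$ real and $ad - bc = 1$.
As we already know from Theorem 3.4, $g(z)$ is a conformal automorphism of $\hat{\mathbb{C}}$. Moreover,
since $a, b, c, d \in \mathbb{R}$, we have

\[ \operatorname{Im}\left( \frac{az + b}{cz + d} \right) = \frac{\operatorname{Im}\big( (az + b)(c\bar{z} + d) \big)}{|cz + d|^2} = \frac{\operatorname{Im}\big( ac|z|^2 + bd + adz + bc\bar{z} \big)}{|cz + d|^2} = \frac{ad - bc}{|cz + d|^2} \operatorname{Im}(z) = \frac{\operatorname{Im}(z)}{|cz + d|^2}. \tag{3.11} \]

This immediately implies that $\operatorname{Im}(g(z)) > 0$ if and only if $\operatorname{Im}(z) > 0$. Since $g$ is a bijection of
$\hat{\mathbb{C}}$, it follows that $g$ maps $\mathbb{H}$ bijectively onto itself, that is, $g$ is an automorphism of $\mathbb{H}$.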
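
The identity (3.11) says that $\operatorname{Im}(g(z)) = \operatorname{Im}(z) / |cz + d|^2$ when $a, b, c, d$ are real with $ad - bc = 1$. A minimal Python sketch of a numerical check follows; the helper name `mobius_real`, the coefficients, and the sample points are assumptions chosen for illustration.

```python
def mobius_real(a, b, c, d):
    """Return z -> (a z + b) / (c z + d) for real a, b, c, d with ad - bc = 1."""
    if abs(a * d - b * c - 1) > 1e-12:
        raise ValueError("expected ad - bc = 1")
    return lambda z: (a * z + b) / (c * z + d)


a, b, c, d = 2.0, 1.0, 3.0, 2.0                 # hypothetical coefficients, ad - bc = 1
g = mobius_real(a, b, c, d)
points = [1j, -2 + 0.3j, 5 + 7j, 0.1 + 0.001j]  # sample points in the upper half-plane

# Compare Im(g(z)) with Im(z) / |c z + d|^2 at every sample point.
identity_error = max(
    abs(g(z).imag - z.imag / abs(c * z + d) ** 2) for z in points
)
stays_in_half_plane = all(g(z).imag > 0 for z in points)
```

Since the factor $1 / |cz + d|^2$ is positive, the sign of the imaginary part is preserved, which is the content of the "if" direction.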
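“Only if”: assume that $g \in \operatorname{Aut}(\mathbb{H})$. Then $f = \Phi \circ g \circ \Psi$ is an automorphism of the
unit disc, where $\Phi$ and $\Psi$ are given in (3.9). By Theorem 3.11, $f$ can be expressed as

\[ f(z) = \frac{uz + v}{\bar{v} z + \bar{u}} \]

for some $u, v \in \mathbb{C}$ with $|u|^2 - |v|^2 = 1$. To calculate what this means for $g = \Psi \circ f \circ \Phi$, we
switch to the notation of matrix multiplication, which, as we know from (3.3)-(3.5), is
a way to represent the action of Möbius transformations. The matrices associated with
the action of $\Phi$, $\Psi$, and $f$ are

\[ \begin{pmatrix} 1 & -i \\ 1 & i \end{pmatrix}, \qquad \begin{pmatrix} -i & -i \\ 1 & -1 \end{pmatrix}, \qquad \begin{pmatrix} u & v \\ \bar{v} & \bar{u} \end{pmatrix}. \]

Therefore the map $\Psi \circ f \circ \Phi$ is represented by the matrix product

\[ \begin{pmatrix} -i & -i \\ 1 & -1 \end{pmatrix} \begin{pmatrix} u & v \\ \bar{v} & \bar{u} \end{pmatrix} \begin{pmatrix} 1 & -i \\ 1 & i \end{pmatrix}. \]

More explicitly, if we denote $u = x + iy$ and $v = \mu + i\nu$ to represent $u, v$ in terms of their
real and imaginary parts, then this matrix product is

\[ \begin{pmatrix} -i & -i \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x + iy & \mu + i\nu \\ \mu - i\nu & x - iy \end{pmatrix} \begin{pmatrix} 1 & -i \\ 1 & i \end{pmatrix} = 2i \begin{pmatrix} -x - \mu & -y + \nu \\ y + \nu & -x + \mu \end{pmatrix} =: 2i \begin{pmatrix} a & b \\ c & d \end{pmatrix}. \]

The numbers $a, b, c, d$ thus defined are real, and moreover it is easy to check that
$ad - bc = 1$ (hint: determinants). Note that the scalar factor $2i$ multiplying the matrix is
irrelevant when we go back to considering $g$ as a Möbius transformation instead of a matrix,
that is, we see that $g(z)$ is indeed of the form $\frac{az + b}{cz + d}$ with $a, b, c, d$ as claimed in the theorem.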
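
The matrix computation in the "only if" part can be double-checked numerically. The Python sketch below assumes the matrices $\begin{pmatrix} 1 & -i \\ 1 & i \end{pmatrix}$ for $\Phi$ and $\begin{pmatrix} -i & -i \\ 1 & -1 \end{pmatrix}$ for $\Psi$ (that is, $\Phi(z) = \frac{z - i}{z + i}$ and its inverse); these explicit forms, the helper names, and the sample values of $u$ and $v$ are assumptions made for this illustration.

```python
import cmath
import math


def matmul2(m1, m2):
    """Product of two 2 x 2 matrices given as nested tuples."""
    (a1, b1), (c1, d1) = m1
    (a2, b2), (c2, d2) = m2
    return ((a1 * a2 + b1 * c2, a1 * b2 + b1 * d2),
            (c1 * a2 + d1 * c2, c1 * b2 + d1 * d2))


PHI = ((1, -1j), (1, 1j))      # matrix of z -> (z - i) / (z + i)
PSI = ((-1j, -1j), (1, -1))    # matrix of the inverse map w -> -i (w + 1) / (w - 1)


def half_plane_coefficients(u, v):
    """Return (a, b, c, d) with PSI * F * PHI = 2i * [[a, b], [c, d]]."""
    F = ((u, v), (v.conjugate(), u.conjugate()))
    (p, q), (r, s) = matmul2(matmul2(PSI, F), PHI)
    return tuple(entry / 2j for entry in (p, q, r, s))


# Hypothetical sample pair with |u|^2 - |v|^2 = cosh(t)^2 - sinh(t)^2 = 1.
t = 0.7
u = math.cosh(t) * cmath.exp(0.4j)
v = math.sinh(t) * cmath.exp(-1.1j)
a, b, c, d = half_plane_coefficients(u, v)
largest_imaginary_part = max(abs(e.imag) for e in (a, b, c, d))
determinant = a * d - b * c
```

After dividing by the scalar $2i$, the four entries come out real (up to rounding) with determinant 1, and they agree with the closed-form expressions in terms of the real and imaginary parts of $u$ and $v$.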


The automorphism group

\[ \operatorname{Aut}(\mathbb{H}) = \left\{ z \mapsto \frac{az + b}{cz + d} : a, b, c, d \in \mathbb{R},\ ad - bc = 1 \right\} \]

is known as the projective special linear group (of order 2 over the real numbers)
and sometimes denoted $\mathrm{PSL}(2, \mathbb{R})$. By the natural association between $2 \times 2$ matrices and
Möbius transformations discussed in Section 3.3, it can be identified with the quotient
group

\[ \mathrm{SL}(2, \mathbb{R}) / \{\pm I\}, \]

where $\mathrm{SL}(2, \mathbb{R})$ is the special linear group of order 2 over $\mathbb{R}$ (the group of $2 \times 2$
real matrices with determinant 1), and $\{\pm I\}$ is its subgroup with two elements containing
the identity matrix and its negation.


3.7 The Riemann mapping theorem: a more precise formulation 


We formulated in Section 3.4 a version of the Riemann mapping theorem that identi- 
fies an interesting conformal equivalence class of complex regions. Conceptually, this is 
what I regard as the main content of the theorem. Note that this formulation is carefully 
“neutral” in the sense of not singling out any member of the equivalence class as being 
more important or worthy of attention than others. However, in practice, we already 
discussed the fact that the unit disc and upper half-plane are each in their own way 
somewhat canonical members of the class. By contrast, other member regions such as, 
say, the unit square, seldom play a particularly important role in the theory, although 
from a purely geometric point of view, they may be just as natural, and they may appear 
in specific applications. 

Furthermore, as we inch our way toward a proof of the theorem, it does in fact
become convenient to fix a specific member of the class, namely the unit disc, as the target
region for the conformal maps we will construct. Another small conceptual advance
is to add more information about the conformal map mapping a given region $\Omega$ to $\mathbb{D}$
so as to ensure uniqueness. This leads us to the following more detailed version of the
theorem.


Theorem 3.14 (Riemann mapping theorem: detailed version). Let $\Omega \subset \mathbb{C}$ be a simply
connected complex region with $\Omega \neq \mathbb{C}$, and let $z_0 \in \Omega$. Then there exists a unique
biholomorphism $F : \Omega \to \mathbb{D}$ with the properties that

1. $F(z_0) = 0$.

2. $F'(z_0)$ is a positive real number.


Proof of uniqueness. Let $F_1$ and $F_2$ be two biholomorphisms with the properties
described in the theorem. Then the conformal map $\Phi = F_2 \circ F_1^{-1}$ is an automorphism of $\mathbb{D}$
that fixes 0, so by Corollary 3.8 it is a rotation, that is, of the form $\Phi(z) = az$ for some $a$
with $|a| = 1$. On the other hand, the constant $a$ can be expressed as

\[ a = \Phi'(0) = F_2'\big(F_1^{-1}(0)\big) \cdot \big(F_1^{-1}\big)'(0) = \frac{F_2'(z_0)}{F_1'(z_0)}, \]

which shows that it is a positive real number. It follows that $a = 1$ and $\Phi(z) = z$, that is,
$F_1 = F_2$.


The history of the Riemann mapping theorem


The Riemann mapping theorem was formulated by the great Bernhard Riemann in 1851 as part of his 
PhD thesis. Riemann stated the result for regions with a piecewise smooth boundary and gave a proof 
that contained useful ideas but was later realized to be flawed. Later nineteenth-century mathematicians 
worked hard to fill in the gaps in Riemann’s argument, with varying levels of success. The first proof con- 
sidered to be fully correct by modern standards was given by Osgood in 1900. Osgood’s proof, like others 
before it, relied on the “potential-theoretic” approach (related to Dirichlet’s principle and the study of 
Laplace’s equation) advocated by Riemann rather than on ideas of a more conceptually complex-analytic 
nature. This approach, while interesting, has since fallen out of fashion as an approach to proving the 
Riemann mapping theorem because of various technical shortcomings it has. 

The proof of the theorem we present in Sections 3.8-3.9 is described in Walsh’s historical survey [72] 
as the “standard modern proof.” You will find it described in most complex analysis textbooks, as it ap- 
pears to be the simplest proof known today. For additional details on the interesting history of Riemann’s 
famous theorem and the ideas developed out of it, see the historical reviews [33, 72]. 


The more difficult part of Theorem 3.14 is the existence claim. As we will see, the
key insight needed for the proof is that the problem of mapping $\Omega$ conformally to $\mathbb{D}$ can
be formulated as a maximization problem for a certain functional. Specifically, in the
class $\mathcal{F}$ consisting of all the injective maps $F$ from $\Omega$ into $\mathbb{D}$ that map $z_0$ to 0 and
for which $F'(z_0)$ is a positive real number, we will see that the one map that is also surjective
(and thus establishes the required conformal equivalence of $\Omega$ to $\mathbb{D}$) is the one for which
the number $F'(z_0)$ is maximal. This will be shown in a somewhat constructive way by
arguing that if $F$ is not surjective, then we can exploit a point that is “missing” from
the image to produce a new conformal map $G : \Omega \to \mathbb{D}$ with a larger value of $G'(z_0)$.
Although the basic idea of how this is done is fairly simple (see Lemma 3.21), there are a
few technical issues that need to be addressed to turn it into a complete proof, namely
showing that the class $\mathcal{F}$ is nonempty, that the functional $F \mapsto F'(z_0)$ attains a maximum,
and so on. The details are given in the next two sections.


3.8 Proof of the Riemann mapping theorem, part I: technical 
background 


In this section, we prove a few auxiliary results needed for the proof of the Riemann
mapping theorem. Two of the results, Montel’s and Hurwitz’s theorems, are theorems
in complex analysis. The third, the Arzelà-Ascoli theorem, is a theorem in real analysis.
Let $\mathcal{F}$ be a family of complex-valued continuous functions on a complex region $\Omega$.
We say that $\mathcal{F}$ is locally uniformly bounded if for any compact set $K \subset \Omega$, we have

$$\sup_{f \in \mathcal{F},\ z \in K} |f(z)| < \infty. \tag{3.12}$$

We say that $\mathcal{F}$ is locally uniformly equicontinuous if for any compact $K \subset \Omega$ and any
$\epsilon > 0$, there exists $\delta > 0$ such that

$$\text{if } z_1, z_2 \in K \text{ and } |z_1 - z_2| < \delta, \text{ then } \sup_{f \in \mathcal{F}} |f(z_1) - f(z_2)| < \epsilon. \tag{3.13}$$


The following is a version of the well-known Arzelà-Ascoli theorem, a staple of real
and functional analysis, slightly adapted to our setting.


Theorem 3.15 (Arzelà-Ascoli theorem). Let $\mathcal{F}$ be a family of continuous complex-valued
functions on $\Omega$. Assume that the family is locally uniformly equicontinuous and locally
uniformly bounded. Then any sequence $(f_n)_{n=1}^\infty$ of functions in $\mathcal{F}$ has a subsequence
$(f_{n_k})_{k=1}^\infty$ that converges uniformly on compacts in $\Omega$ to some continuous function $f$.


Proof. Let $Q = (z_m)_{m=1}^\infty$ be a dense countable set of points in $\Omega$ (ordered as a sequence
according to some arbitrary enumeration). The sequence $(f_n(z_1))_{n=1}^\infty$ is a sequence of
complex numbers taking values in a compact set $\{|z| \le M_1\}$, where we denote
$M_1 = \sup_{f \in \mathcal{F}} |f(z_1)| < \infty$ (guaranteed to be finite by (3.12)). By compactness this
sequence therefore has a convergent subsequence, which we denote by $(f_n^{(1)}(z_1))_{n=1}^\infty$ (instead
of the more traditional subsequence notation $f_{n_k}(z_1)$). That is, $f_n^{(1)}$ is the notation for the
$n$th function in the extracted subsequence of the original sequence of functions $(f_n(z))_n$.

Now we extract a further subsequence of this subsequence, noting that the sequence
$(f_n^{(1)}(z_2))_{n=1}^\infty$ is a sequence of complex numbers taking values in a compact set
$\{|z| \le M_2\}$, where

$$M_2 = \sup_{f \in \mathcal{F},\ z \in \{z_1, z_2\}} |f(z)|.$$

(Again, the local uniform boundedness assumption guarantees that $M_2 < \infty$.) So
again by compactness, this sequence has a convergent subsequence, which we denote
by $(f_n^{(2)}(z_2))_{n=1}^\infty$.

Continuing in this way, we proceed to successively extract nested subsequences
$(f_n^{(1)})_n, (f_n^{(2)})_n, \ldots$ of the original sequence of functions, where each subsequence is
extracted as a further subsequence of the previous one. These subsequences have the
property that for each $j \ge 1$, the $j$th sequence $(f_n^{(j)})_{n=1}^\infty$ is a subsequence of the original
sequence $(f_n)_n$ for which $f_n^{(j)}(z_m)$ converges to a limit as $n \to \infty$ for $m = 1, 2, \ldots, j$.

Now consider the “diagonal” sequence in this nested sequence of subsequences:
we let $g_n = f_n^{(n)}$. Then $(g_n)_{n=1}^\infty$ is a subsequence of $(f_n)_n$ with the property that $g_n(z_m)$
converges to a limit as $n \to \infty$ for all $m \ge 1$.

We claim that the sequence of functions $(g_n(z))_{n=1}^\infty$ converges uniformly on compacts
in $\Omega$. Let $K \subset \Omega$ be compact, and let $\epsilon > 0$. Let $\delta > 0$ be a number, guaranteed to exist by
the assumption of local uniform equicontinuity, with the property that

$$\text{if } z_1, z_2 \in K \text{ and } |z_1 - z_2| < \delta, \text{ then } \sup_{f \in \mathcal{F}} |f(z_1) - f(z_2)| < \frac{\epsilon}{3}.$$

(Compare with (3.13): we merely replaced $\epsilon$ there with $\epsilon/3$, with the usual goal in
mind that some other bound later will end up smaller than $\epsilon$.) The containment
$K \subset \bigcup_{\zeta \in K} D_{\delta/2}(\zeta)$ gives an open covering of $K$, which by compactness has a finite
subcovering $D_{\delta/2}(\zeta_1), \ldots, D_{\delta/2}(\zeta_q)$. Select a point $z_{\nu_j}$ of the countable dense set $Q$ from
each of the subcovering discs $D_{\delta/2}(\zeta_j)$. For any $1 \le j \le q$, $(g_k(z_{\nu_j}))_{k=1}^\infty$ is a convergent
sequence or, equivalently, is a Cauchy sequence; therefore there exists an index $N_j \ge 1$ such that

$$|g_k(z_{\nu_j}) - g_\ell(z_{\nu_j})| < \frac{\epsilon}{3}$$

whenever $k, \ell \ge N_j$. Set $N = \max(N_1, N_2, \ldots, N_q)$. Then for any $w \in K$, we have that
$w \in D_{\delta/2}(\zeta_j) \subset D_\delta(z_{\nu_j})$ for some $1 \le j \le q$. It follows that, for $k, \ell \ge N$,

$$|g_k(w) - g_\ell(w)| \le |g_k(w) - g_k(z_{\nu_j})| + |g_k(z_{\nu_j}) - g_\ell(z_{\nu_j})| + |g_\ell(z_{\nu_j}) - g_\ell(w)| < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon.$$

This establishes that $(g_n(z))_{n=1}^\infty$ is a Cauchy sequence uniformly on $K$ and hence (by a
standard fact from real analysis) converges uniformly on $K$. The compact $K$ was arbitrary,
so we proved the existence of a subsequence that converges uniformly on compacts;
the fact that the limiting function must be continuous is standard, and the proof
of the theorem is complete.


Returning to the realm of complex analysis, we now introduce the concept of a normal
family of functions. Let $\Omega$ be a complex region as before. A family $\mathcal{F}$ of holomorphic
functions on $\Omega$ is called normal, or a normal family, if every sequence $(f_n)_{n=1}^\infty$ in
the family has a subsequence $(f_{n_k})_{k=1}^\infty$ such that $f_{n_k}$ converges uniformly on compacts to
a holomorphic function $g$.


Theorem 3.16 (Montel’s theorem). Let $\mathcal{F}$ be a family of holomorphic functions on a region
$\Omega$ that is locally uniformly bounded. Then $\mathcal{F}$ is a normal family.


Proof. We claim that the added assumption of holomorphicity of the members of $\mathcal{F}$,
together with local uniform boundedness, implies that the family is locally uniformly
equicontinuous. Once we show this, the Arzelà-Ascoli theorem will imply that every
sequence $(F_n)_{n=1}^\infty$ of elements in the family has a subsequence $F_{n_k}$ that converges
uniformly on compacts to a limiting function $F$. Then it would follow that $F$ is holomorphic
by standard properties of uniform convergence on compacts (Theorem 1.39 on p. 45),
and we would be done.

We start by showing a weaker version of the required property that does not include
uniformity over compact subsets. Fix a point $a \in \Omega$ and a radius $\rho > 0$ such that
$\overline{D_{2\rho}(a)} \subset \Omega$. Later we will need to emphasize the dependence of $\rho$ on $a$, so we will
then denote it by $\rho(a)$. If $z_1, z_2 \in D_\rho(a)$, then by Cauchy’s integral formula we have,
uniformly over all $f \in \mathcal{F}$,

$$
\begin{aligned}
|f(z_1) - f(z_2)|
&= \left| \frac{1}{2\pi i} \oint_{|w-a|=2\rho} f(w) \left( \frac{1}{w - z_1} - \frac{1}{w - z_2} \right) dw \right| \\
&= \frac{|z_1 - z_2|}{2\pi} \left| \oint_{|w-a|=2\rho} \frac{f(w)}{(w - z_1)(w - z_2)} \, dw \right| \\
&\le \frac{1}{2\pi} |z_1 - z_2| \cdot \sup_{|w-a|=2\rho} |f(w)| \cdot 2\pi(2\rho) \cdot \frac{1}{\rho^2}
\le \frac{2M}{\rho} |z_1 - z_2|,
\end{aligned}
\tag{3.14}
$$

where we denote $M = \sup_{f \in \mathcal{F},\ |w-a|=2\rho} |f(w)|$, a finite number by the local uniform
boundedness assumption.
Now fix a number $\epsilon > 0$. If we define the number

$$\eta = \min\left( \rho, \frac{\epsilon \rho}{4M} \right) > 0,$$

then by (3.14) we have the property that

$$\text{if } z_1, z_2 \in D_\eta(a), \text{ then } \sup_{f \in \mathcal{F}} |f(z_1) - f(z_2)| < \epsilon. \tag{3.15}$$

This is the nonuniform local equicontinuity property alluded to above. Note that the
parameter $\eta$ depends on the point $a$, so we will now redenote it by $\eta(a)$ to emphasize
this dependence. ($\eta$ also depends on $\epsilon$, but the value of $\epsilon$ will remain fixed throughout
the discussion.)
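
The Lipschitz-type estimate behind (3.14) can be checked numerically. The sketch below is only an
illustration under assumptions that are not in the text: the family is taken to be the monomials
$f_n(z) = z^n$, $1 \le n \le 20$, on the unit disc, with centre $a = 0$ and $\rho = 1/4$, and the
constant tested is $2M/\rho$ with $M$ the supremum over the circle $|w - a| = 2\rho$.

```python
import cmath
import random

random.seed(0)

CENTER, RHO = 0j, 0.25     # the closed disc of radius 2*RHO lies inside the unit disc
FAMILY = [lambda z, n=n: z ** n for n in range(1, 21)]   # hypothetical family f_n(z) = z^n

# M = sup of |f(w)| over the family and over the (sampled) circle |w - CENTER| = 2*RHO.
circle = [CENTER + 2 * RHO * cmath.exp(2j * cmath.pi * k / 360) for k in range(360)]
M = max(abs(f(w)) for f in FAMILY for w in circle)


def random_point(radius: float) -> complex:
    """Uniformly distributed random point of the open disc of the given radius around CENTER."""
    return CENTER + cmath.rect(radius * random.random() ** 0.5,
                               random.uniform(0, 2 * cmath.pi))


def worst_gap_ratio(radius: float, trials: int = 2000) -> float:
    """Largest observed value of sup_f |f(z1) - f(z2)| / |z1 - z2| for z1, z2 in the disc."""
    worst = 0.0
    for _ in range(trials):
        z1, z2 = random_point(radius), random_point(radius)
        if z1 != z2:
            gap = max(abs(f(z1) - f(z2)) for f in FAMILY)
            worst = max(worst, gap / abs(z1 - z2))
    return worst


lipschitz_bound = 2 * M / RHO      # the constant on the right-hand side of (3.14)
observed = worst_gap_ratio(RHO)
print(observed, lipschitz_bound)
```

The observed ratio stays well below the bound, which is expected: the estimate is uniform over the
family but far from sharp for any particular member.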

Finally, we can derive the uniform-over-compacts version of local equicontinuity.
Let $K \subset \Omega$ be a compact set, and let $\epsilon > 0$ be the same as above. Consider the covering of
$K$ by open sets given by

$$K \subset \bigcup_{a \in K} D_{\eta(a)/2}(a).$$

By compactness there exists a finite subcovering

$$K \subset \bigcup_{j=1}^{n} D_{\eta(a_j)/2}(a_j)$$

for some points $a_1, \ldots, a_n \in K$. Denote $\delta = \frac{1}{2} \min(\eta(a_1), \ldots, \eta(a_n))$. Then we claim that
for all $z_1, z_2 \in K$ such that $|z_1 - z_2| < \delta$,

$$\sup_{f \in \mathcal{F}} |f(z_1) - f(z_2)| < \epsilon. \tag{3.16}$$

Indeed, $z_1$ must belong to $D_{\eta(a_j)/2}(a_j)$ for some $1 \le j \le n$ by the defining property of the
subcovering. This also implies that

$$|z_2 - a_j| \le |z_2 - z_1| + |z_1 - a_j| < \delta + \frac{\eta(a_j)}{2} \le \frac{\eta(a_j)}{2} + \frac{\eta(a_j)}{2} = \eta(a_j),$$

so altogether we see that both $z_1, z_2$ are in $D_{\eta(a_j)}(a_j)$. Relation (3.16) therefore follows
from (3.15). To summarize, we proved that for any compact set $K \subset \Omega$ and $\epsilon > 0$, (3.13)
is satisfied with the choice of $\delta$ as defined above; this proves that the family $\mathcal{F}$ is locally
uniformly equicontinuous and concludes the proof of the theorem.


Theorem 3.17 (Hurwitz’s theorem). Let $\Omega \subset \mathbb{C}$ be a region, and let $(f_n(z))_{n=1}^\infty$ and $g(z)$ be
holomorphic functions on $\Omega$ such that $f_n(z) \to g(z)$ uniformly on compacts in $\Omega$ as $n \to \infty$,
where $g(z)$ is not the zero function. If $z_0 \in \Omega$ is a zero of $g(z)$ of order $k \ge 0$, and
$D_r(z_0) \subset \Omega$ is a disc centered at $z_0$ such that the punctured closed disc
$\overline{D_r(z_0)} \setminus \{z_0\}$ contains no zeros of $g(z)$, then for any large enough $n$, $f_n(z)$ has
precisely $k$ zeros in $D_r(z_0)$ counting multiplicities.


Proof. Recall that by the argument principle the order $k$ of the zero of $g(z)$ at $z_0$ can be
expressed as the contour integral

$$k = \frac{1}{2\pi i} \oint_{|z - z_0| = r} \frac{g'(z)}{g(z)} \, dz. \tag{3.17}$$

Denote by $\kappa_n$ the number of zeros of $f_n(z)$ in $D_r(z_0)$ counting multiplicities. We wish to
express $\kappa_n$ similarly as a contour integral over the same circle. This can be done but
requires first checking that $f_n(z)$ does not have any zeros on the circle, which is indeed
true for large $n$. Let $M = \inf_{|z - z_0| = r} |g(z)|$ and note that $M > 0$ by the assumption that
$g(z)$ has no zeros in the punctured disc $\overline{D_r(z_0)} \setminus \{z_0\}$ and, in particular, on the circle. By
the uniform convergence of $f_n(z)$ to $g(z)$ on the circle there exists an index $N \ge 1$ such
that for all $n \ge N$, $\inf_{|z - z_0| = r} |f_n(z)| \ge M/2$, so that, in particular, $f_n(z)$ also does not have
any zeros on the circle $|z - z_0| = r$ as we wanted to show. Thus we have the expression

$$\kappa_n = \frac{1}{2\pi i} \oint_{|z - z_0| = r} \frac{f_n'(z)}{f_n(z)} \, dz \tag{3.18}$$

for all $n \ge N$.

Note also that on the circle $|z - z_0| = r$ we have not only the uniform convergence
$f_n(z) \to g(z)$, but also that of the derivatives $f_n'(z) \to g'(z)$ (recall Theorem 1.39). Combining
those facts, we deduce also that

$$\frac{f_n'(z)}{f_n(z)} \xrightarrow[n \to \infty]{} \frac{g'(z)}{g(z)}$$

uniformly on the circle $|z - z_0| = r$. Finally, this, together with (3.17) and (3.18), implies
that

$$\kappa_n = \frac{1}{2\pi i} \oint_{|z - z_0| = r} \frac{f_n'(z)}{f_n(z)} \, dz \xrightarrow[n \to \infty]{} \frac{1}{2\pi i} \oint_{|z - z_0| = r} \frac{g'(z)}{g(z)} \, dz = k.$$

Since $k$ and $\kappa_n$ are all integers, it follows that $\kappa_n = k$ for all sufficiently large $n$.
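
The zero-counting integrals in this proof are easy to approximate numerically. The following
sketch is illustrative only: the limit function $g(z) = z^2$, the approximants
$f_n(z) = z^2 + 1/n$, the radius $r = 1/2$, and the helper names are all assumptions made for
the example, and the contour integral is discretized with an equally spaced Riemann sum.

```python
import cmath


def count_zeros(f, df, center: complex, r: float, samples: int = 4000) -> float:
    """Approximate (1 / (2*pi*i)) times the integral of f'/f over the circle |z - center| = r."""
    total = 0j
    step = 2 * cmath.pi / samples
    for k in range(samples):
        direction = cmath.exp(1j * k * step)
        z = center + r * direction
        dz = 1j * r * direction * step      # derivative of the parametrization times dt
        total += df(z) / f(z) * dz
    return (total / (2j * cmath.pi)).real


def g(z: complex) -> complex:
    return z ** 2


def dg(z: complex) -> complex:
    return 2 * z


def make_f(n: int):
    """Hypothetical approximants f_n(z) = z^2 + 1/n, returned together with their derivatives."""
    return (lambda z: z ** 2 + 1.0 / n), (lambda z: 2 * z)


radius = 0.5
k = count_zeros(g, dg, 0j, radius)
kappa = {n: round(count_zeros(*make_f(n), 0j, radius)) for n in (1, 2, 10, 100, 1000)}
print(round(k), kappa)
```

For $n = 1, 2$ the zeros $\pm i/\sqrt{n}$ of $f_n$ lie outside the circle, so the count is 0; once
$n > 4$ both zeros are inside and the count equals the order $k = 2$ of the zero of $g$, as the
theorem predicts for large $n$.
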
Corollary 3.18. Let $\Omega \subset \mathbb{C}$ be a region, and as in Hurwitz’s theorem, let $(f_n(z))_{n=1}^\infty$ and $g(z)$
be holomorphic functions on $\Omega$ such that $f_n(z) \to g(z)$ uniformly on compacts in $\Omega$. If the
functions $f_n(z)$ are all injective, then $g(z)$ is either injective or a constant.


Proof. Assume by contradiction that $g(z)$ is not injective and also not a constant function.
Then there exist distinct points $a, b \in \Omega$ for which $g(a) = g(b)$. We have the
convergence $f_n(a) \to g(a)$, and so, if we define functions $\psi(z)$ and $\varphi_n(z)$, $n = 1, 2, \ldots$, by

$$\psi(z) = g(z) - g(a), \qquad \varphi_n(z) = f_n(z) - f_n(a),$$

then $\varphi_n(z) \to \psi(z)$ uniformly on compacts in $\Omega$. Moreover, $\psi(z)$ is not the zero function.
Therefore we are in a position to apply Hurwitz’s theorem. Specifically, note that
$\psi(b) = 0$, and denote the order of the zero at $b$ by $k \ge 1$. Let $r > 0$ be such that the
punctured closed disc $\overline{D_r(b)} \setminus \{b\}$ does not contain any other zeros of $\psi(z)$ (so, in
particular, it does not contain the point $z = a$). Applying Hurwitz’s theorem, we conclude that
for all sufficiently large $n$, $\varphi_n(z)$ has at least one zero in the disc $D_r(b)$. However, this is
impossible, since $\varphi_n(z)$ already has one zero at $z = a$ and was assumed to be an injective
function. We have reached a contradiction, and the proof is complete.


Suggested exercises for Section 3.8. 3.5, 3.6. 


3.9 Proof of the Riemann mapping theorem, part II: the main 
construction 


From now on, let $\Omega$ be a simply connected complex region with $\Omega \neq \mathbb{C}$ and $z_0 \in \Omega$, as in
the statement of Theorem 3.14.


Lemma 3.19. There exists an injective holomorphic function $G : \Omega \to \mathbb{D}$.


Proof. We know that $\Omega$ is not the entire complex plane, so take some point $a \in \mathbb{C} \setminus \Omega$. The
function $z \mapsto z - a$ has no zeros on $\Omega$, so, since $\Omega$ is simply connected, by Theorem 1.53
there exists a branch of the logarithm function of $z - a$ on it, that is, a holomorphic
function $h(z)$ such that $e^{h(z)} = z - a$ for all $z \in \Omega$.

Fix an arbitrary point $\beta \in \Omega$, and define a function $G : \Omega \to \mathbb{C}$ by

$$G(z) = \frac{1}{h(z) - h(\beta) - 2\pi i}. \tag{3.19}$$

We claim that $G(z)$ is holomorphic, injective, and bounded on $\Omega$; this would imply that
its scaled version $F(z) = cG(z)$ is injective and maps into $\mathbb{D}$ if $c$ is a small enough positive
constant, which would prove the result.

To establish these properties of $G(z)$, note first that $h(z)$ is injective, since $h(z) = h(w)$
implies $z - a = e^{h(z)} = e^{h(w)} = w - a$, so $z = w$. Clearly, $G(z) = G(w)$ also implies
$h(z) = h(w)$, so similarly implies $z = w$, which shows that $G(z)$ is injective.

Now the claim that $G(z)$ is bounded is equivalent to the claim that

$$\inf_{z \in \Omega} |h(z) - (h(\beta) + 2\pi i)| > 0.$$

Assume by contradiction that this is not true. Then there is a sequence $(z_n)_{n=1}^\infty$ of points
in $\Omega$ such that $h(z_n) \to h(\beta) + 2\pi i$ as $n \to \infty$. Exponentiating, we get that

$$z_n - a = e^{h(z_n)} \xrightarrow[n \to \infty]{} e^{h(\beta) + 2\pi i} = e^{h(\beta)} = \beta - a.$$

In other words, $z_n$ converges to $\beta$ as $n \to \infty$. However, then we would have that $h(z_n)$
converges to $h(\beta)$ and not to $h(\beta) + 2\pi i$. This gives a contradiction and finishes the proof.
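
As a sanity check of the construction (3.19), the sketch below works it out for one concrete
region. The choices are assumptions made for illustration and do not come from the text: the
region is the slit plane $\mathbb{C} \setminus (-\infty, 0]$, the omitted point is $a = 0$, the branch
$h$ is the principal logarithm, and $\beta = 1$. With these choices the imaginary part of
$h(z) - h(\beta) - 2\pi i$ lies in $(-3\pi, -\pi)$, so $|G| < 1/\pi$ and no rescaling is needed.

```python
import cmath
import random

random.seed(1)

BETA = 1 + 0j      # arbitrary base point of the slit plane (an assumption of this example)


def h(z: complex) -> complex:
    """Branch of log(z - a) with a = 0: the principal logarithm on the slit plane."""
    return cmath.log(z)


def G(z: complex) -> complex:
    """The map of (3.19) for this particular region and branch."""
    return 1.0 / (h(z) - h(BETA) - 2j * cmath.pi)


def random_point() -> complex:
    """Random point of the plane minus the negative real axis, with log-uniform modulus."""
    modulus = 10 ** random.uniform(-6, 6)
    angle = 0.999999 * random.uniform(-cmath.pi, cmath.pi)
    return cmath.rect(modulus, angle)


points = [random_point() for _ in range(5000)]
sup_G = max(abs(G(z)) for z in points)
print(sup_G, 1 / cmath.pi)
```

The sampled supremum stays below $1/\pi$, consistent with the boundedness claim; injectivity is
inherited from the logarithm exactly as in the proof.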


Now define the family of functions

$$\mathcal{F} = \{ F : \Omega \to \mathbb{D} \,:\, F(z) \text{ is holomorphic and injective, } F(z_0) = 0 \}.$$

The family $\mathcal{F}$ is not empty: if $G(z)$ is an injective holomorphic function $G : \Omega \to \mathbb{D}$
guaranteed to exist by Lemma 3.19, then clearly $F(z) = c(G(z) - G(z_0))$ is an element of
$\mathcal{F}$ if $c$ is a small enough positive number. Define the number $\lambda \in [0, \infty]$ by

$$\lambda = \sup_{F \in \mathcal{F}} |F'(z_0)|.$$


Lemma 3.20. $0 < \lambda < \infty$.


Proof. Let $F \in \mathcal{F}$. To bound $|F'(z_0)|$ from above, observe that, by the Cauchy integral
formula, if $r > 0$ is a number for which the closed disc $\overline{D_r(z_0)}$ is contained in $\Omega$, then

$$|F'(z_0)| = \left| \frac{1}{2\pi i} \oint_{|w - z_0| = r} \frac{F(w)}{(w - z_0)^2} \, dw \right|
\le \frac{1}{2\pi} \cdot 2\pi r \cdot \frac{1}{r^2} \cdot \sup_{w \in \Omega} |F(w)| \le \frac{1}{r},$$

since $F$ maps into the unit disc. Since this is true for all $F \in \mathcal{F}$, we get that $\lambda \le 1/r$. On the
other hand, we claim that $|F'(z_0)| > 0$, which would show that $\lambda > 0$. Indeed, if $F'(z_0) = 0$,
then $F(z) - F(z_0)$ has a zero of order at least 2 at $z_0$. By Corollary 1.58, $F(z)$ is not locally injective
in any neighborhood of $z_0$, in contradiction to the fact that $F$ is injective. Thus $|F'(z_0)|$
must be positive.


We now come to the most important lemma of this section, which contains the key
idea behind our proof of the Riemann mapping theorem.


Lemma 3.21. Given $F \in \mathcal{F}$, if $F(\Omega) \subsetneq \mathbb{D}$ (that is, the image of $\Omega$ under $F$ does not cover all
of $\mathbb{D}$), then there exists $G \in \mathcal{F}$ for which $|G'(z_0)| > |F'(z_0)|$.


Proof. Take some $w \in \mathbb{D} \setminus F(\Omega)$, known to exist by the assumption. Since $w$ is not in the
image of $\Omega$ under $F$, the point 0 is not in the image of the composed map $\varphi_w \circ F : \Omega \to \mathbb{D}$,
where (recall from (3.6) and Lemma 3.9) $\varphi_w(z) = \frac{w - z}{1 - \overline{w} z}$ is the standard automorphism of
$\mathbb{D}$ mapping 0 and $w$ to each other. Since $\varphi_w \circ F$ does not take the value 0 and is defined
on a simply connected region, by the construction of $n$th root functions described in
Section 1.15 there exists a holomorphic branch of its square root, that is, a holomorphic
function $S : \Omega \to \mathbb{D}$ satisfying

$$S(z)^2 = (\varphi_w \circ F)(z). \tag{3.20}$$

Now define $G : \Omega \to \mathbb{D}$ by the composition

$$G(z) = (\varphi_{S(z_0)} \circ S)(z). \tag{3.21}$$

We claim that $G(z)$ has the properties claimed by the lemma. First,

$$G(z_0) = (\varphi_{S(z_0)} \circ S)(z_0) = \varphi_{S(z_0)}(S(z_0)) = 0.$$

Second, note that $S(z)$ is injective since its square is injective as a composition of two
injective maps. Therefore $G(z)$ is also injective. Both of those facts together show that
$G \in \mathcal{F}$.

Third and crucially, we wish to show that $|G'(z_0)| > |F'(z_0)|$. To this end, note that
by (3.20) and (3.21), $F(z)$ can be represented in terms of $G(z)$ as

$$F(z) = \varphi_w\big( (\varphi_{S(z_0)} \circ G)(z)^2 \big). \tag{3.22}$$

(This is a key relation that deserves to be digested properly. Take a minute or two to
unwrap all the horrible notation and convince yourself that this relation is correct, and
see if you can find some deeper meaning here.) Alternatively, if we define the function
$W : \mathbb{D} \to \mathbb{D}$ by

$$W(z) = \varphi_w\big( \varphi_{S(z_0)}(z)^2 \big),$$

then (3.22) can be rewritten as

$$F(z) = (W \circ G)(z). \tag{3.23}$$

Note that

$$W(0) = \varphi_w\big( \varphi_{S(z_0)}(0)^2 \big) = \varphi_w\big( S(z_0)^2 \big) = \varphi_w\big( \varphi_w(F(z_0)) \big) = F(z_0) = 0.$$

Thus $W(z)$ satisfies the assumptions of Schwarz’s lemma, and we conclude that $|W'(0)| \le 1$,
and in fact the strict inequality $|W'(0)| < 1$ holds, since $W(z)$ is clearly not a rotation.
This is what we want, since by (3.23)

$$|F'(z_0)| = |W'(G(z_0)) \, G'(z_0)| = |W'(0)| \cdot |G'(z_0)|,$$

which gives the desired conclusion that $|G'(z_0)| > |F'(z_0)|$.

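
The square-root trick of Lemma 3.21 can be traced through on a toy example. The sketch below is
an illustration under assumptions not found in the text: $\Omega = \mathbb{D}$, $z_0 = 0$,
$F(z) = z/2$ (injective but not surjective), and the missed point $w = 3/4$. The automorphism is
taken to be $\varphi_w(z) = (w - z)/(1 - \overline{w} z)$, and the comparison value
$|F'(z_0)| (1 + |w|) / (2\sqrt{|w|})$ is a side computation from the chain rule, not a formula
quoted from the book.

```python
import cmath


def phi(w: complex):
    """Disc automorphism z -> (w - z) / (1 - conj(w) z), which swaps 0 and w."""
    return lambda z: (w - z) / (1 - w.conjugate() * z)


def derivative(f, z: complex, h: float = 1e-6) -> complex:
    """Central-difference approximation of the complex derivative f'(z)."""
    return (f(z + h) - f(z - h)) / (2 * h)


z0 = 0j
w = 0.75 + 0j                      # a point of the unit disc that F misses


def F(z: complex) -> complex:
    """Injective, non-surjective map of the unit disc into itself with F(0) = 0."""
    return z / 2


def S(z: complex) -> complex:
    """Branch of the square root of phi_w(F(z)).

    For this F and w the values phi_w(F(z)) stay in the right half-plane,
    so the principal square root is a holomorphic branch.
    """
    return cmath.sqrt(phi(w)(F(z)))


def G(z: complex) -> complex:
    """The improved map of (3.21): compose S with the automorphism sending S(z0) to 0."""
    return phi(S(z0))(S(z))


dF = abs(derivative(F, z0))
dG = abs(derivative(G, z0))
predicted = dF * (1 + abs(w)) / (2 * abs(w) ** 0.5)
print(dF, dG, predicted)
```

The derivative grows from 0.5 to roughly 0.505: a small gain, but a strict one, which is all the
maximization argument needs.
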
Lemma 3.22. The family $\mathcal{F}$ is a normal family.


Proof. The functions in $\mathcal{F}$ all map into the unit disc, so they are uniformly bounded, and
a fortiori locally uniformly bounded. By Montel’s theorem, $\mathcal{F}$ is normal.


Lemma 3.23. There exists an element $F \in \mathcal{F}$ for which $|F'(z_0)| = \lambda$, that is, the functional
$G \mapsto |G'(z_0)|$ attains a maximum in the family $\mathcal{F}$.


Proof. Let $(F_n)_{n=1}^\infty$ be a sequence of elements of $\mathcal{F}$ such that we have the convergence
$|F_n'(z_0)| \to \lambda$. By Lemma 3.22 there is a subsequence $(F_{n_k})_{k=1}^\infty$ that converges uniformly on
compacts in $\Omega$ to some limiting function $F : \Omega \to \mathbb{C}$, which moreover satisfies $F(z_0) = 0$,
since $F_n(z_0) = 0$ for all $n$. Since uniform convergence on compacts implies convergence
of the derivatives, we have that $|F'(z_0)| = \lambda$. Since the $F_n$ are all injective, by Hurwitz’s
theorem (in the form of Corollary 3.18), $F$ either is a constant function or is injective, but we
know from Lemma 3.20 that $|F'(z_0)| = \lambda > 0$, and hence $F$ is not a constant and is therefore
injective.

Let $z \in \Omega$. We know that $|F(z)| \le 1$, since it is the limit of functions whose modulus
is bounded by 1. However, $F$ is holomorphic and nonconstant, and hence by the open mapping
theorem, $F(\Omega)$ is an open set contained in the closed disc $\{z : |z| \le 1\}$ and therefore is
contained in the open disc $\mathbb{D}$. Thus we have shown that $F$ is an element of $\mathcal{F}$, and the proof is
complete.


Proof of existence in Theorem 3.14. Take the element $F \in \mathcal{F}$, guaranteed to exist by
Lemma 3.23, for which $|F'(z_0)| = \lambda$. By composing $F$ with a rotation if necessary, we may
assume that $F'(z_0)$ is real and positive. By Lemma 3.21, $F$ must be surjective: otherwise
there would exist $G \in \mathcal{F}$ with $|G'(z_0)| > |F'(z_0)| = \lambda$, contradicting the definition of $\lambda$.
This, together with the positivity of $F'(z_0)$ and the properties implied by belonging to $\mathcal{F}$, gives
that $F$ is the biholomorphism whose existence was claimed.


Summarizing, we proved the uniqueness claim from Theorem 3.14 in Section 3.7, 
and the existence claim was proved above. This finishes the proof of the Riemann map- 
ping theorem. 


3.10 Annuli and doubly connected regions 


The topic of conformal mapping does not end with the consideration of simply connected
regions, where the problem of classifying complex regions up to conformal
equivalence is now essentially settled (at least in principle) by the Riemann mapping
theorem. To conclude this chapter, we give a brief taste of some of the interesting
phenomena that arise when we try to classify conformal equivalence classes of regions
that are not simply connected, starting with the next simplest case of regions that are
doubly connected. A region $\Omega$ is called doubly connected if the complement $\mathbb{C} \setminus \Omega$ has
two connected components.³

Figure 3.2: An annulus $A(r_1, r_2)$.

One important class of doubly connected regions are the annuli. For $0 < r_1 < r_2$, we
denote

$$A(r_1, r_2) = \{ z : r_1 < |z| < r_2 \},$$

an open annulus centered at 0 with internal radius $r_1$ and external radius $r_2$ (Fig. 3.2).
It turns out that unlike the situation for simply connected regions, these annuli are not
all in a single conformal equivalence class, despite being homeomorphic. The precise
classification is given in the next result, sometimes known as Schottky’s theorem.


Theorem 3.24 (Conformal classification of annuli). Let $0 < r_1 < r_2$ and $0 < \rho_1 < \rho_2$. The
annuli $A(r_1, r_2)$ and $A(\rho_1, \rho_2)$ are conformally equivalent if and only if

$$\frac{r_1}{r_2} = \frac{\rho_1}{\rho_2}.$$


Proof. “If”: assume that $\frac{r_1}{r_2} = \frac{\rho_1}{\rho_2}$. Then the map $z \mapsto \frac{\rho_1}{r_1} z = \frac{\rho_2}{r_2} z$ is a conformal
equivalence between $A(r_1, r_2)$ and $A(\rho_1, \rho_2)$.

“Only if”: this is the nontrivial direction. Assume that $A(r_1, r_2)$ and $A(\rho_1, \rho_2)$ are
conformally equivalent. We start with a normalization that fixes the two inner radii at 1
to simplify things a bit: denote $u = r_2/r_1$ and $v = \rho_2/\rho_1$. Then $A(1, u)$ is conformally
equivalent to $A(r_1, r_2)$ (by the scaling transformation mentioned in the “if” part), and
similarly $A(1, v)$ is conformally equivalent to $A(\rho_1, \rho_2)$. Therefore $A(1, u)$ and $A(1, v)$ are
conformally equivalent to each other. Let $f : A(1, u) \to A(1, v)$ be a conformal map. We
can assume without loss of generality that $f$ maps the inner boundary circle $|z| = 1$ to
itself and maps the outer boundary circle $|z| = u$ of $A(1, u)$ to its counterpart $|z| = v$ in
$A(1, v)$; otherwise, $f$ maps the inner circle of $A(1, u)$ to the outer circle of $A(1, v)$ and vice
versa, and in that case, we can get a conformal map that maps the inner circle to itself
by replacing $f$ by $f(u/z)$ (the composition of $f$ with the inversion $z \mapsto u/z$, which is a
conformal automorphism of $A(1, u)$).

3 More generally, $\Omega$ is called $k$-connected if $\mathbb{C} \setminus \Omega$ has $k$ connected components and finitely
connected if it is $k$-connected for some $k \ge 1$.

For each $1 < r < u$, let $\gamma_r$ denote the circular contour $\{|z| = r\}$, and let $\Gamma_r = f \circ \gamma_r$
denote its image under the map $f$. The curve $\Gamma_r$ is a simple closed curve and hence encloses
a well-defined region (see Theorem 1.26 and the discussion following it in Section 1.8),
which we denote by $\Omega_r$. The area enclosed by $\gamma_r$ is, of course, $\pi r^2$. The area of $\Omega_r$ is a
continuous increasing function of $r$, which we denote $a(r)$. Two important observations
about $a(r)$ are that

$$A_- := \lim_{r \searrow 1} a(r) = \pi \quad \text{and} \quad A_+ := \lim_{r \nearrow u} a(r) = \pi v^2,$$

since $A_-$ and $A_+$ are simply the areas enclosed by the inner and outer boundary circles
of $A(1, v)$, respectively.
Now we claim that

$$a(r) \ge \pi r^2 \quad \text{for all } 1 < r < u. \tag{3.24}$$

This would imply, by taking the limit as $r \nearrow u$, that $\pi v^2 = A_+ \ge \pi u^2$, so we would get that
$v \ge u$. Reversing the roles of the two annuli would imply the reverse inequality $v \le u$,
and we would get that $u = v$, which is the claim we wanted, and the proof would be
done.

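To prove (3.24), we note that $a(r)$ can be evaluated as a contour integral using a
complex-analytic version of Green’s theorem from calculus. Specifically, appealing to
the result of Exercise 3.7, we see that

$$a(r) = \frac{1}{2i} \oint_{\Gamma_r} \overline{z} \, dz
= \frac{1}{2i} \int_0^{2\pi} \overline{f(re^{it})} \, \frac{d}{dt}\big( f(re^{it}) \big) \, dt
= \frac{1}{2} \int_0^{2\pi} \overline{f(re^{it})} \, f'(re^{it}) \, r e^{it} \, dt. \tag{3.25}$$

Now let

$$f(z) = \sum_{n=-\infty}^{\infty} c_n z^n \tag{3.26}$$

be the Laurent expansion of $f$, which converges uniformly on compacts in the annulus
$1 < |z| < u$ where $f$ is holomorphic (see Theorem 1.65). Substituting (3.26) into (3.25), we
get that

$$
\begin{aligned}
a(r) &= \frac{1}{2} \int_0^{2\pi} \Big( \sum_{n} \overline{c_n} \, r^n e^{-int} \Big) \Big( \sum_{m} m c_m r^{m-1} e^{i(m-1)t} \Big) r e^{it} \, dt \\
&= \frac{1}{2} \sum_{n,m} m c_m \overline{c_n} \, r^{n+m} \int_0^{2\pi} e^{i(m-n)t} \, dt
= \pi \sum_{n=-\infty}^{\infty} n |c_n|^2 r^{2n}.
\end{aligned}
$$

Taking the limit as $r \searrow 1$ gives that

$$\pi = A_- = \pi \sum_{n=-\infty}^{\infty} n |c_n|^2.$$

Now it follows that

$$a(r) - \pi r^2 = \pi \sum_{n=-\infty}^{\infty} n |c_n|^2 r^{2n} - \pi r^2 \sum_{n=-\infty}^{\infty} n |c_n|^2
= \pi r^2 \sum_{n=-\infty}^{\infty} n |c_n|^2 \big( r^{2n-2} - 1 \big).$$

Since each summand in this last expression is nonnegative, we have that $a(r) - \pi r^2 \ge 0$,
as claimed.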
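
The identity $a(r) = \pi \sum_n n |c_n|^2 r^{2n}$ used in this proof can be tested numerically. The
sketch below is an illustration only: the function $f(z) = z + 0.1 z^2$ and its coefficients are
hypothetical choices (it is injective near the circle $|z| = 2$, but it is not a conformal map
between annuli), and the contour integral in (3.25) is discretized with an equally spaced sum.

```python
import cmath

C1, C2 = 1.0, 0.1      # hypothetical Laurent coefficients: f(z) = C1*z + C2*z^2


def f(z: complex) -> complex:
    return C1 * z + C2 * z ** 2


def df(z: complex) -> complex:
    return C1 + 2 * C2 * z


def area_by_contour(r: float, samples: int = 2000) -> float:
    """(1/2) * integral over [0, 2*pi] of conj(f(r e^{it})) * f'(r e^{it}) * r e^{it} dt."""
    total = 0j
    dt = 2 * cmath.pi / samples
    for k in range(samples):
        e = cmath.exp(1j * k * dt)
        total += f(r * e).conjugate() * df(r * e) * r * e * dt
    return (total / 2).real


def area_by_series(r: float) -> float:
    """pi * sum over n of n * |c_n|^2 * r^(2n), for the two nonzero coefficients."""
    return cmath.pi * (1 * abs(C1) ** 2 * r ** 2 + 2 * abs(C2) ** 2 * r ** 4)


r = 2.0
print(area_by_contour(r), area_by_series(r))
```

Both numbers agree with $4.32\pi$. Because the integrand is a trigonometric polynomial, the
equally spaced sum is exact up to rounding, so the agreement is a direct check of the orthogonality
step in the computation.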


Having classified the annuli up to conformal equivalence, we state without proof 
an additional result that explains why the family of annuli plays a role in the theory of 
conformal mapping of doubly connected regions that parallels the role of the unit disc 
in the case of simply connected regions. For the proof, see [2, 6]. 


Theorem 3.25 (Conformal classification of doubly connected regions). The annuli $A(1, \rho)$,
$\rho > 1$, form a complete set of conformal equivalence representatives for doubly connected
complex regions. That is, if $\Omega \subset \mathbb{C}$ is a doubly connected region, then $\Omega$ is conformally
equivalent to $A(1, \lambda)$ for precisely one value of $\lambda > 1$.


The number $m_\Omega = \frac{1}{2\pi} \log(\lambda)$, where $\lambda$ is the outer radius of the annulus to which $\Omega$
maps, is called the conformal modulus of $\Omega$. Theorem 3.24 guarantees that if such a
number exists, then it is unique, and the much stronger Theorem 3.25 guarantees that
it exists. Thus $m_\Omega$ is an important example of what is known as a conformal invariant.
Much more can be said about $m_\Omega$, including a more direct way to define it that is intrinsic
to $\Omega$ and does not rely on the idea of conformally mapping $\Omega$ to an annulus; consult the
references mentioned above for details.

The final component in the discussion of conformal equivalence classes of doubly
connected regions is the identification of the conformal automorphisms of such a region.


Theorem 3.26 (Conformal automorphisms of an annulus). The conformal automorphism
group of the annulus $A(r_1, r_2)$ is

$$\operatorname{Aut}(A(r_1, r_2)) = \{ z \mapsto e^{i\theta} z \,:\, 0 \le \theta < 2\pi \} \cup \Big\{ z \mapsto e^{i\theta} \frac{r_1 r_2}{z} \,:\, 0 \le \theta < 2\pi \Big\}.$$

That is, the automorphisms consist of the rotations $z \mapsto e^{i\theta} z$, together with the
compositions of the inversion map $z \mapsto \frac{r_1 r_2}{z}$ with a rotation.


Proof. Exercise 3.9.


Suggested exercises for Section 3.10. 3.7, 3.8, 3.9.


Exercises for Chapter 3


3.1 If $\Omega$ and $\Omega'$ are conformally equivalent with a conformal map $g : \Omega \to \Omega'$, then
describe an explicit group isomorphism between $\operatorname{Aut}(\Omega)$ and $\operatorname{Aut}(\Omega')$.

3.2 Let $z_1, z_2, z_3, w_1, w_2, w_3$ be elements of $\mathbb{C}$. Prove that there is a unique Möbius
transformation mapping $z_j$ to $w_j$ for $j = 1, 2, 3$.

3.3 Prove that besides the singleton conformal equivalence classes $\{\mathbb{C}\}$ and $\{\hat{\mathbb{C}}\}$
described above, any other conformal equivalence class $\mathcal{K}$ is infinite and in fact
contains an infinity of regions any two of which are not images of each other under an
affine transformation $z \mapsto az + b$.

3.4 Prove Theorem 3.11.

3.5 Show that the assumption of holomorphicity in Montel’s theorem (Theorem 3.16)
cannot be removed; that is, the result properly belongs in complex analysis and
does not have a real analysis analogue (at least not an obvious one).

3.6 Show that the real analysis analogue of Hurwitz’s theorem is not true.

3.7 The complex-analytic version of Green’s formula from multivariate calculus states
that if $\gamma$ is a simple closed contour in the plane, then the area $A$ enclosed inside $\gamma$ is
given by

$$A = \frac{1}{2i} \oint_\gamma \overline{z} \, dz.$$

Show that this follows from the usual Green’s theorem in real-variable calculus.

3.8 Prove that the statement of Theorem 3.24 is also correct under the relaxed assumption
$0 \le r_1 < r_2$ and $0 \le \rho_1 < \rho_2$, which addresses also the case of “degenerate”
annuli with an inner radius of 0 (that is, punctured discs).

3.9 Prove Theorem 3.26.


4 Elliptic functions 


The theory of elliptic functions is the fairyland of mathematics. The mathematician who once gazes 
upon this enchanting and wondrous domain crowded with the most beautiful relations and con- 
cepts is forever captivated. 


Richard Bellman, “A Brief Introduction to Theta Functions” (1961) 


4.1 Motivation: elliptic curves 


Elliptic curves are fascinating objects studied in complex analysis, algebraic geometry,
number theory, cryptography, and other areas of mathematics. An elliptic curve $E$ is the
set of solutions to an algebraic equation of the form

$$E : \quad y^2 = ax^3 + bx^2 + cx + d \tag{4.1}$$

relating a cubic in $x$ to a quadratic function of $y$, where the coefficients (and solutions)
are assumed to be elements of some field $\mathbb{F}$, such as the rationals, reals, complex numbers,
or a finite field. It is often helpful to assume further that the curve is nondegenerate,
that is, that the cubic polynomial on the right-hand side of (4.1) has no multiple
roots (see Section 4.11 for a related discussion).

To study elliptic curves, it is helpful to first bring equation (4.1) to a simpler canonical
form, usually written as

$$E : \quad y^2 = 4x^3 - g_2 x - g_3, \tag{4.2}$$

through a standard change of variables; I skip the details of such a reduction. From here
on, we will take (4.2) as the definition of an elliptic curve.

A beautiful and surprising fact about elliptic curves that holds the key to many of
their amazing properties is that they form an abelian group in a natural way. The group
operation, denoted as a kind of “addition” operation $P \oplus Q$ for two points $P = (x_1, y_1)$
and $Q = (x_2, y_2)$ on the curve, can be defined algebraically using a messy and strange
formula that you would never think to guess directly. However, the formula has a simple
geometric interpretation, which is very easy to explain: the idea is that to compute
$P \oplus Q$, you find the intersection point $R = (x_3, y_3)$ of the line passing through $P$ and
$Q$ with the curve (other than the points $P$ and $Q$ themselves) and then reflect $R$ in the
$y$-coordinate to define $P \oplus Q = (x_3, -y_3)$; see Fig. 4.1. The fact that this construction is
well-defined is tied to the subtle fact that a generic straight line intersects the curve at
precisely three points. (I use the word “generic” because there are also technicalities
involving degenerate cases where the line is tangent to the curve, which means that we
have to be careful in interpreting this definition for a “doubling” operation $P \oplus P$, or
where one of the three intersection points is not actually there, in which case we add
an additional “point at infinity” to serve in its place. I ignore such technical issues in the
current informal discussion.)

Figure 4.1: An elliptic curve and the group addition law, visualized here for the curve
$y^2 = x^3 - x + \frac{1}{10}$ over the real numbers.

Taking the above geometric construction, we can work out by an explicit calculation
that the algebraic expressions for the coordinates of the result $P \oplus Q = (x_3, -y_3)$ of the
group addition of $P$ and $Q$ described above in geometric terms (again, in the generic
situation) are given by the supremely unintuitive formulas

$$x_3 = \frac{1}{4} \left( \frac{y_1 - y_2}{x_1 - x_2} \right)^2 - x_1 - x_2, \tag{4.3}$$

$$-y_3 = -\frac{1}{4} \left( \frac{y_1 - y_2}{x_1 - x_2} \right)^3
+ \frac{(x_1^3 y_1 - x_2^3 y_2) - 2(x_1^3 y_2 - x_2^3 y_1) + 3 x_1 x_2 (x_1 y_2 - x_2 y_1)}{(x_1 - x_2)^3}. \tag{4.4}$$

It is far from clear why these formulas should define an associative operation, let alone
a group law (at least the fact that the operation is commutative is easy to see). Even for
the geometric construction, associativity requires some effort to explain (see [62, Ch. 1]).
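
The chord construction is short enough to carry out in exact rational arithmetic. The sketch
below is a hedged illustration: the curve $y^2 = 4x^3 - 4x + 1$ (that is, $g_2 = 4$, $g_3 = -1$),
the three sample points, and the function names are assumptions chosen for the example, and only
the generic case of distinct $x$-coordinates is handled. The closed form for $-y_3$ is formula
(4.4) as read here.

```python
from fractions import Fraction

G2, G3 = Fraction(4), Fraction(-1)     # hypothetical curve y^2 = 4x^3 - 4x + 1


def on_curve(P) -> bool:
    """Check the canonical equation y^2 = 4x^3 - g2*x - g3."""
    x, y = P
    return y * y == 4 * x ** 3 - G2 * x - G3


def add(P, Q):
    """Chord construction for two points with distinct x-coordinates (generic case only)."""
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2:
        raise ValueError("generic case only: the x-coordinates must differ")
    m = (y1 - y2) / (x1 - x2)          # slope of the line through P and Q
    x3 = m * m / 4 - x1 - x2           # formula (4.3)
    y3 = y1 + m * (x3 - x1)            # third intersection point R = (x3, y3)
    return (x3, -y3)                   # reflect R to obtain the sum


def minus_y3_closed_form(P, Q):
    """Second coordinate of the sum, computed from the closed form (4.4)."""
    (x1, y1), (x2, y2) = P, Q
    m = (y1 - y2) / (x1 - x2)
    numerator = ((x1 ** 3 * y1 - x2 ** 3 * y2)
                 - 2 * (x1 ** 3 * y2 - x2 ** 3 * y1)
                 + 3 * x1 * x2 * (x1 * y2 - x2 * y1))
    return -m ** 3 / 4 + numerator / (x1 - x2) ** 3


P = (Fraction(0), Fraction(1))
Q = (Fraction(-1), Fraction(-1))
R = (Fraction(1), Fraction(1))

print(add(P, Q))                       # lies on the curve again
print(add(add(P, Q), R), add(P, add(Q, R)))
```

For these three points both bracketings give the same point, which is at least consistent with
associativity; it is of course a spot check on one example, not a proof.
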

All of this raises many intriguing questions about elliptic curves in the specific context
of curves defined over the complex numbers:

1. Where does the group structure of elliptic curves “really” come from? That is, is
there a conceptual way of thinking about them that makes it easy to see that such a
group addition law should exist and that makes it possible to avoid the need for a
cumbersome calculation to verify that (4.3)-(4.4) define a valid group operation?

2. What does an elliptic curve look like topologically?

3. Can we classify all elliptic curves up to conformal equivalence as Riemann surfaces?
That is, how do we determine when two elliptic curves are conformally equivalent,
and how do we parameterize the conformal equivalence classes of elliptic curves?

4. What additional roles exist for elliptic curves within complex analysis? What other
topics or problems do they relate to?


It turns out that all these questions and more can be answered by studying a certain family
of meromorphic functions in the complex plane, called elliptic functions or doubly
periodic functions. In fact, all members of the family can be obtained from a single
function, the so-called Weierstrass $\wp$-function, denoted $\wp(z)$, along with its derivative
$\wp'(z)$; and the map $z \mapsto (\wp(z), \wp'(z))$ gives a convenient parameterization of the elliptic
curve $E$, which does much to explain what the elliptic curve and its group law “really”
look like.

The situation is analogous to what happens in the case of a much simpler group
arising from an algebraic equation, the circle group

\[ S^1 = \{ (x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1 \}. \]

There too we have an abelian group “addition” law $\boxplus$ given by

\[ (x_1, y_1) \boxplus (x_2, y_2) = (x_1 x_2 - y_1 y_2,\; x_1 y_2 + x_2 y_1). \]


Although this formula can be easily verified to satisfy the properties of a commutative
group operation through a purely formal calculation, to the uninitiated encountering it
for the first time, the reason why such a group law exists may appear mysterious. Fortunately,
there exists a “circular function” $C : \mathbb{R} \to \mathbb{R}$ that has the following properties:

1. The map $\varphi(t) = (C(t), C'(t))$ maps a real number to an element of $S^1$.

2. $\varphi(t + s) = \varphi(t) \boxplus \varphi(s)$ (that is, $\varphi$ is a group homomorphism from $(\mathbb{R}, +)$ to $(S^1, \boxplus)$).

3. $\varphi(t + 2\pi) = \varphi(t)$, that is, $\varphi$ is periodic with period $2\pi$; equivalently, its kernel as a
group homomorphism is the additive subgroup $2\pi\mathbb{Z}$ of $\mathbb{R}$.


These properties taken together imply that $\varphi$ induces (by the first isomorphism theorem)
a group isomorphism between the quotient group $\mathbb{R}/(2\pi\mathbb{Z})$ with “ordinary” addition
of real numbers (which in the quotient group becomes “addition modulo $2\pi$”) on the
one hand, and $S^1$ with the “exotic” addition law $\boxplus$ on the other hand. That is, the circular
function $C(t)$ and the map $\varphi$ derived from it “linearize” the group operation and
make it apparent that the circle group is topologically a real interval with its two ends
glued together (that is, a circle), with the group operation being addition modulo $2\pi$. Of
course, you may have realized by now that the “circular function” is nothing more than
the familiar cosine function $C(t) = \cos t$. So in this point of view the cosine function
and its derivative can be thought of as gadgets that help us understand the algebraic
and topological structure of the circle group by parameterizing it in terms of a group
that is easier to understand. As we will see, the situation with elliptic curves and the
use of the elliptic functions $\wp(z)$ and $\wp'(z)$ to parameterize them is quite similar. Also,
as happens with the case of the trigonometric functions, the functions we construct out
of this group-theoretic motivation will end up being useful for many other things.
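
The three properties of the circular function are easy to test numerically once $C$ is identified with the cosine. The following is a small sketch under that identification; the names `box_add` and `phi` are hypothetical and chosen only for this illustration.

```python
import math

def box_add(p, q):
    """The 'exotic' addition law on the unit circle from the text."""
    (x1, y1), (x2, y2) = p, q
    return (x1 * x2 - y1 * y2, x1 * y2 + x2 * y1)

def phi(t):
    """phi(t) = (C(t), C'(t)) with C = cos, so that C' = -sin."""
    return (math.cos(t), -math.sin(t))

def close(p, q, tol=1e-12):
    """Componentwise comparison of two points up to a tolerance."""
    return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol

t, s = 0.7, 2.9
print(close(phi(t + s), box_add(phi(t), phi(s))))   # homomorphism property
print(close(phi(t + 2 * math.pi), phi(t)))          # periodicity
```

The check of the homomorphism property is just the pair of angle-addition formulas for cosine and sine in disguise.
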

We now proceed to make precise these somewhat vague notions in a way that gives 
substance to the analogy described above. This will lead us to many new and beautiful 
ideas that will take us far beyond the familiar realm of trigonometric functions. 


4.2 Doubly periodic functions 


The cosine and sine functions in the example discussed above are periodic functions
of a single real variable. We now double the dimensions and look for a meromorphic
function of a complex variable that is “periodic” in two different directions in the plane.
Such a function is called a doubly periodic function or an elliptic function. Formally,
we say that $\omega \in \mathbb{C}$ is a period of a meromorphic function $f : \mathbb{C} \to \mathbb{C}$ if $f(z + \omega) = f(z)$
for all $z \in \mathbb{C}$. The set of periods of $f(z)$ is denoted $\Lambda_f$ and is easily seen to be an additive
subgroup of $\mathbb{C}$. We say that a meromorphic function $f$ is doubly periodic if $\Lambda_f$ contains
two nonzero elements $\omega_1, \omega_2$ that are linearly independent when considered as elements
of a vector space over the real numbers (this is equivalent to saying that the complex
number $\omega_2/\omega_1$ is nonreal). Trivially, if $f, g$ are doubly periodic with the same linearly
independent periods $\omega_1, \omega_2$, then so are $f + g$, $fg$, $f/g$ (provided $g \not\equiv 0$), and the derivative $f'$.

Note that the constant functions have every complex number as a period. This illustrates
the fact that the pair $\omega_1, \omega_2$ of complex numbers attesting to the doubly periodic
nature of a function $f$ is not unique. To understand the less trivial scenario of a function
$f$ that is doubly periodic but not constant, observe that in that case $\Lambda_f$ must be a
topologically discrete additive subgroup of $\mathbb{C}$, for otherwise $f$ can be seen to be constant
by the uniqueness theorem for holomorphic functions (Corollary 1.36 on p. 42), since it
takes the same value on a set of points with an accumulation point. It then follows (see
Exercise 4.1) that $\Lambda_f$ must be of the form $\omega_1\mathbb{Z} + \omega_2\mathbb{Z}$ with nonzero numbers $\omega_1, \omega_2$ that
are linearly independent over $\mathbb{R}$; that is, $\Lambda_f$ is a discrete rank-2 subgroup. A subgroup of
$\mathbb{C}$ of this form is called a lattice. The subgroup $\Lambda_f$ of periods of a nonconstant doubly
periodic function $f$ is called its period lattice.

If $f$ is a nonconstant doubly periodic function with $\Lambda_f = \omega_1\mathbb{Z} + \omega_2\mathbb{Z}$, then we say that
$\omega_1, \omega_2$ form a fundamental period pair for $f$. Not all pairs of periods are fundamental:
for example, if $\omega_1, \omega_2$ is a fundamental period pair, then $2\omega_1, 2\omega_2$ is a pair of periods,
which, while it attests to $f$ being doubly periodic according to the above definition, is
not fundamental since $2\omega_1\mathbb{Z} + 2\omega_2\mathbb{Z}$ is a proper sublattice of $\Lambda_f$. On the other hand,
a nonconstant doubly periodic function has infinitely many fundamental period pairs,
since it is easy to see that the representation $\omega_1\mathbb{Z} + \omega_2\mathbb{Z}$ of a lattice is far from unique; for
example, $\omega_1\mathbb{Z} + \omega_2\mathbb{Z} = (\omega_1 + k\omega_2)\mathbb{Z} + \omega_2\mathbb{Z}$ for any $k \in \mathbb{Z}$. A more precise characterization
of when two pairs $(\omega_1, \omega_2)$ and $(\omega_1', \omega_2')$ generate the same lattice is given in the following
lemma.


Lemma 4.1. Let $L = \omega_1\mathbb{Z} + \omega_2\mathbb{Z}$ and $L' = \omega_1'\mathbb{Z} + \omega_2'\mathbb{Z}$ be lattices. Then $L = L'$ if and only
if $\omega_1'$ and $\omega_2'$ can be represented as

\[ \omega_1' = a\omega_1 + b\omega_2, \tag{4.5} \]

\[ \omega_2' = c\omega_1 + d\omega_2, \tag{4.6} \]

where $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is a $2 \times 2$ invertible matrix with integer entries, that is, $a, b, c, d \in \mathbb{Z}$ and
$ad - bc = \pm 1$.


Proof. Proof of the “if” claim: assume that $\omega_1'$ and $\omega_2'$ have the form (4.5)-(4.6) with
$a, b, c, d \in \mathbb{Z}$, $ad - bc = \pm 1$. Then $\omega_1', \omega_2' \in \omega_1\mathbb{Z} + \omega_2\mathbb{Z}$. This clearly implies that $L' \subset L$. For
the reverse containment, invert relations (4.5)-(4.6) to see that

\[ \omega_1 = \frac{d}{ad - bc}\,\omega_1' - \frac{b}{ad - bc}\,\omega_2', \qquad
   \omega_2 = -\frac{c}{ad - bc}\,\omega_1' + \frac{a}{ad - bc}\,\omega_2', \]

which, because of the assumption that $ad - bc = \pm 1$, is a representation of the form (4.5)-(4.6)
with coefficients satisfying the same conditions, but with the roles of the pairs
$(\omega_1, \omega_2)$ and $(\omega_1', \omega_2')$ reversed. Therefore $L \subset L'$, and altogether we have shown that
$L = L'$.

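Proof of “only if”: assume that $L = L'$, that is, $\omega_1\mathbb{Z} + \omega_2\mathbb{Z} = \omega_1'\mathbb{Z} + \omega_2'\mathbb{Z}$. In particular,
$\omega_1, \omega_2 \in \omega_1'\mathbb{Z} + \omega_2'\mathbb{Z}$ and $\omega_1', \omega_2' \in \omega_1\mathbb{Z} + \omega_2\mathbb{Z}$. It follows that there exist integers
$a, b, c, d, \alpha, \beta, \gamma, \delta$ such that

\[ \omega_1' = a\omega_1 + b\omega_2, \qquad \omega_1 = \alpha\omega_1' + \beta\omega_2', \]

\[ \omega_2' = c\omega_1 + d\omega_2, \qquad \omega_2 = \gamma\omega_1' + \delta\omega_2'. \]

Thus we have representation (4.5)-(4.6) with integer coefficients $a, b, c, d$. Moreover,
since the matrices $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ are inverse to each other and have integer
entries, their determinants are also mutually reciprocal integers, so we must have that

\[ ad - bc = \pm 1. \]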
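
Lemma 4.1 translates directly into a numerical test for whether two period pairs generate the same lattice. The following is a rough sketch, assuming the periods are given as floating-point complex numbers; the helper names `real_coords` and `same_lattice` and the tolerance `tol` are assumptions made for this illustration, not part of the text.

```python
def real_coords(v, w1, w2):
    """Return real (s, t) with v = s*w1 + t*w2, by Cramer's rule.
    Requires w1, w2 to be linearly independent over the reals."""
    v, w1, w2 = complex(v), complex(w1), complex(w2)
    det = (w1.conjugate() * w2).imag      # Re(w1)Im(w2) - Im(w1)Re(w2)
    s = (v.conjugate() * w2).imag / det
    t = (w1.conjugate() * v).imag / det
    return s, t

def same_lattice(w1, w2, v1, v2, tol=1e-9):
    """Check the criterion of Lemma 4.1: (v1, v2) is obtained from (w1, w2)
    by an integer matrix with determinant +1 or -1."""
    a, b = real_coords(v1, w1, w2)
    c, d = real_coords(v2, w1, w2)
    entries = (a, b, c, d)
    rounded = [round(x) for x in entries]
    if any(abs(x - n) > tol for x, n in zip(entries, rounded)):
        return False                      # some coefficient is not an integer
    a, b, c, d = rounded
    return abs(a * d - b * c) == 1

print(same_lattice(1, 1j, 1 + 1j, 1j))    # a sheared basis of the square lattice
print(same_lattice(1, 1j, 2, 2j))         # a proper sublattice
```

Rounding with a tolerance is a pragmatic choice for floating-point input; with exact (for example rational) coordinates the integrality test could be made exact.
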


A doubly periodic function $f$ with a fundamental period pair $\omega_1, \omega_2$ is determined
uniquely by its values on the parallelogram

\[ P_{z_0}(\omega_1, \omega_2) = \{ z_0 + t\omega_1 + s\omega_2 : 0 \le t, s < 1 \}, \]

where $z_0 \in \mathbb{C}$ is an arbitrary point. This is geometrically obvious, since if we denote by
$L = \omega_1\mathbb{Z} + \omega_2\mathbb{Z}$ the period lattice, then $\mathbb{C}$ is tiled perfectly by nonoverlapping $L$-translates
of $P_{z_0}(\omega_1, \omega_2)$ (that is, shifted copies of the form $\omega + P_{z_0}(\omega_1, \omega_2)$ with $\omega \in L$), and the value
of $f(z)$ for $z$ in some $L$-translate $\omega + P_{z_0}(\omega_1, \omega_2)$ reduces by periodicity to the value
$f(z - \omega)$ at the shifted point $z - \omega$, which is in $P_{z_0}(\omega_1, \omega_2)$. We refer to $P_{z_0}(\omega_1, \omega_2)$ as a
fundamental parallelogram for $f$; see Fig. 4.2. Note that the fundamental parallelogram depends on the choice of a
fundamental period pair, so the choice of a parallelogram contains some arbitrariness
in the same way that the choice of a fundamental period pair is arbitrary. Moreover, the
additional (also arbitrary) parameter $z_0$ allows us to specify the “origin” of the parallelogram;
it is convenient to have that extra degree of freedom to avoid slight technical
complications in some of the results below.


Figure 4.2: A fundamental parallelogram $P_{z_0}(\omega_1, \omega_2)$ and its $L$-translates.
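
The reduction of an arbitrary point to the fundamental parallelogram can be made concrete. The following is a minimal sketch, assuming the half-open convention $0 \le t, s < 1$ for the parallelogram; the function name `reduce_to_parallelogram` is a hypothetical helper introduced for this illustration.

```python
import math

def reduce_to_parallelogram(z, w1, w2, z0=0j):
    """Return (z_red, m, n) with z = z_red + m*w1 + n*w2, where
    z_red = z0 + t*w1 + s*w2 for some 0 <= t, s < 1."""
    z, w1, w2, z0 = complex(z), complex(w1), complex(w2), complex(z0)
    det = (w1.conjugate() * w2).imag      # nonzero since w1, w2 are R-independent
    u = z - z0
    t = (u.conjugate() * w2).imag / det   # real coordinate of u along w1
    s = (w1.conjugate() * u).imag / det   # real coordinate of u along w2
    m, n = math.floor(t), math.floor(s)   # integer parts select the L-translate
    return z - m * w1 - n * w2, m, n

z_red, m, n = reduce_to_parallelogram(7.3 - 2.4j, 2, 1 + 1j)
print(z_red, m, n)
```

For a doubly periodic $f$ with these periods, $f(z)$ and $f(z_{\mathrm{red}})$ agree, so evaluating $f$ anywhere in the plane reduces to evaluating it on the parallelogram.
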


Suggested exercises for Section 4.2. 4.1. 


4.3 Poles and zeros; the order of a doubly periodic function 


An obvious goal that we have is to construct some nontrivial doubly periodic functions,
assuming that they exist.¹ To motivate our construction and help convince you that it is
in a sense the simplest one that has any chance of working, it would be helpful to understand
what sorts of constraints exist on doubly periodic functions. The next few results
show that there are in fact rather rigid constraints that such functions must satisfy.


¹ A tip for the reader: when you are reading a mathematical text and read a definition of a new and
exotic class of mathematical objects, it is a good habit to always ask yourself right away: does such an 
object even exist? For, although in the case of a textbook the answer will usually be “yes,” when you are 
reading research papers on topics at the forefront of human knowledge, the answer will occasionally be 
far from clear even to the writer of the text and may well turn out to be “no.” Even for textbook-level 
mathematics, asking this question and spending a few minutes trying to answer it by yourself will often 
provide you with insight far beyond what a purely passive reading of the text can offer.


Proposition 4.2. There are no entire doubly periodic functions other than the constant
functions.


Proof. If $f$ is entire and doubly periodic, then in particular $f$ is bounded on the parallelogram
$\{ t\omega_1 + s\omega_2 : t, s \in [0, 1] \}$, which is a compact set. By periodicity, $f(z)$ is also
bounded on all of $\mathbb{C}$ and is therefore constant by Liouville’s theorem.


We see from Proposition 4.2 that a nonconstant doubly periodic function $f$ must
have poles; by applying the same result to $1/f$ we see that $f$ must also have zeros. Note
that since the sets of zeros and poles of a nonconstant meromorphic function are discrete, $f$ can have at
most finitely many zeros and poles in any fundamental parallelogram. To avoid certain
technical issues, it is helpful to choose the “origin point” $z_0$ for the fundamental parallelogram
$P_{z_0}(\omega_1, \omega_2)$ in such a way that $f$ does not have poles or zeros on the boundary of
the parallelogram. We call a fundamental parallelogram with such a property generic
(for the doubly periodic function $f$). It is easy to see that a generic fundamental parallelogram
exists.


Proposition 4.3. Let $f$ be a doubly periodic function with fundamental period pair $\omega_1, \omega_2$.
Let $P_{z_0}(\omega_1, \omega_2)$ be a generic fundamental parallelogram for $f$. Then

\[ \oint_{\partial P_{z_0}(\omega_1, \omega_2)} f(z)\, dz = 0, \tag{4.7} \]

where we consider the boundary $\partial P_{z_0}(\omega_1, \omega_2)$ as an integration contour oriented in the
usual way in the positive mathematical direction.


Proof. Decompose the contour $\Gamma = \partial P_{z_0}(\omega_1, \omega_2)$ as the concatenation

\[ \Gamma = \gamma_1 + \gamma_2 + \gamma_3 + \gamma_4 \]

of four contours $\gamma_1, \gamma_2, \gamma_3, \gamma_4$ corresponding to the edges of the parallelogram, where $\gamma_1$
is the directed line segment from $z_0$ to $z_0 + \omega_1$, $\gamma_2$ is the directed line segment from $z_0 + \omega_1$
to $z_0 + \omega_1 + \omega_2$, $\gamma_3$ is the directed line segment from $z_0 + \omega_1 + \omega_2$ to $z_0 + \omega_2$, and $\gamma_4$ is the
directed line segment from $z_0 + \omega_2$ to $z_0$. By the doubly periodic property of $f$ we have

\[ \int_{\gamma_1} f(z)\, dz = -\int_{\gamma_3} f(w)\, dw, \]

since the change of variables $w = z + \omega_2$ maps the integral on the left to the one on the
right (including the minus sign). Thus, in the contour integral on $\Gamma$, the contributions
from the integrals over the two segments $\gamma_1$ and $\gamma_3$ cancel each other out. Similarly, by the
change of variables $w = z - \omega_1$ we get a cancelation of the second and fourth segments:

\[ \int_{\gamma_2} f(z)\, dz = -\int_{\gamma_4} f(w)\, dw, \]

so that in total we have

\[ \oint_{\Gamma} f(z)\, dz = \int_{\gamma_1} f(z)\, dz + \int_{\gamma_2} f(z)\, dz + \int_{\gamma_3} f(z)\, dz + \int_{\gamma_4} f(z)\, dz = 0, \]

as claimed.


Corollary 4.4. Under the assumptions of Proposition 4.3, the sum of the residues of $f$ over
the poles of $f$ in the fundamental parallelogram $P_{z_0}(\omega_1, \omega_2)$ is zero.


Proof. By the residue theorem the integral on the left-hand side of (4.7) is equal to $2\pi i$
times the sum of the residues.


Corollary 4.5. A nonconstant doubly periodic function with a generic fundamental parallelogram
$P_{z_0}(\omega_1, \omega_2)$ must have at least two poles, counting multiplicities, inside the parallelogram.


Proposition 4.6. Let $g : \mathbb{C} \to \mathbb{C}$ be a doubly periodic function with fundamental period
pair $\omega_1, \omega_2$ and a generic fundamental parallelogram $P = P_{z_0}(\omega_1, \omega_2)$. The sum of the
orders of the zeros of $g(z)$ inside $P$ is equal to the sum of the orders of the poles of $g(z)$ in
the parallelogram, counting with multiplicities.

Proof. Apply Proposition 4.3 to $f(z) = g'(z)/g(z)$, and note that by the argument principle
(Theorem 1.48) the resulting integral is $2\pi i$ times the number of zeros minus the number
of poles of $g$ in the interior of $P$.


The last result enables us to define an important integer parameter associated with 
a doubly periodic function, called its order. This is made precise in the next result, which 
follows immediately from Proposition 4.6. 


Corollary 4.7. Let $f$ be a nonconstant doubly periodic function. There exists a unique integer
$m \ge 2$, called the order of $f$, with the following properties:

1. $f$ has exactly $m$ poles, counting with multiplicities, in any generic fundamental parallelogram
$P_{z_0}(\omega_1, \omega_2)$.

2. For any $a \in \mathbb{C}$, $f(z)$ assumes the value $a$ exactly $m$ times (that is, the function $z \mapsto
f(z) - a$ has $m$ zeros), counting with multiplicities, in any fundamental parallelogram
$P_{z_0}(\omega_1, \omega_2)$ that is generic for the doubly periodic function $f(z) - a$.


Proposition 4.8. Let $g : \mathbb{C} \to \mathbb{C}$ be a nonconstant doubly periodic function with fundamental
period pair $\omega_1, \omega_2$. Let $P = P_{z_0}(\omega_1, \omega_2)$ be a generic fundamental parallelogram for
$g$. Denote by $z_1, \ldots, z_m$ the zeros of $g(z)$ in $P$, counting multiplicities, and let $w_1, \ldots, w_m$ be
the poles of $g(z)$ in $P$, counting multiplicities. Then the number

\[ \sum_{j=1}^{m} z_j - \sum_{k=1}^{m} w_k \tag{4.8} \]

is a period of $g$.


Proof. Similarly to the proof of Proposition 4.3, we consider the contour integral

\[ \oint_{\partial P} \frac{z\, g'(z)}{g(z)}\, dz, \]

which by the residue theorem is evaluated as $2\pi i \big( \sum_{j=1}^{m} z_j - \sum_{k=1}^{m} w_k \big)$. We use the same
decomposition of the contour $\partial P$ into four subcontours $\gamma_j$, $1 \le j \le 4$, as in the proof of
Proposition 4.3. Note that by the periodicity of $g$ the images of each of the subcontours
$\gamma_1$ and $\gamma_2$ under $g(z)$ (denoted $g \circ \gamma_1$ and $g \circ \gamma_2$, respectively) are closed curves. Therefore
we can use the same changes of variable as in the proof of Proposition 4.3 to write

\[
\begin{aligned}
\int_{\gamma_1} \frac{z\, g'(z)}{g(z)}\, dz + \int_{\gamma_3} \frac{w\, g'(w)}{g(w)}\, dw
&= \int_{\gamma_1} \left( \frac{z\, g'(z)}{g(z)} - \frac{(z + \omega_2)\, g'(z + \omega_2)}{g(z + \omega_2)} \right) dz \\
&= -\omega_2 \int_{\gamma_1} \frac{g'(z)}{g(z)}\, dz = -\omega_2 \int_{g \circ \gamma_1} \frac{dw}{w} = -\omega_2 \cdot 2\pi i\, p
\end{aligned}
\]

for some integer $p$ equal to the winding number (see Section 1.13) of the closed curve
$g \circ \gamma_1$ around 0. (Note that $g \circ \gamma_1$ does not cross 0 because we chose $P$ to be a generic
parallelogram for $g$.) By similar reasoning,

\[ \int_{\gamma_2} \frac{z\, g'(z)}{g(z)}\, dz + \int_{\gamma_4} \frac{w\, g'(w)}{g(w)}\, dw = \omega_1 \cdot 2\pi i\, q \]

with $q \in \mathbb{Z}$. Combining these results gives that the quantity in (4.8) is of the form $-p\omega_2 +
q\omega_1$ for integers $p, q$ and hence is a period.


4.4 Construction of the Weierstrass $\wp$-function


We are now ready to construct our first doubly periodic function, the Weierstrass
$\wp$-function mentioned at the beginning of the chapter, which occupies a central place in
the theory of elliptic functions. The construction is motivated by the following general
principle that we see in many areas of mathematics: to construct an object with certain
symmetry, it is often helpful to start with a nonsymmetric object and then symmetrize
it by summing over its orbit under the action of the desired symmetry group. Our construction
follows this template, although in practice we will need to deviate from it in a
small way. In our situation the symmetry group is the group of translations $z \mapsto z + \omega$
where $\omega$ is a period, so this will involve an infinite summation over the elements of the
period lattice $L$, which leads to slightly delicate issues of convergence. The next lemma
clarifies what kind of summations are well-behaved enough to be useful.


The symbol $\wp$

The mathematical symbol $\wp$ (pronounced similarly to the name of the letter “p,” or sometimes as
“Weierstrass p” depending on the context) used for the Weierstrass elliptic function has an apparently unique
status in mathematical notation as a symbol that is reserved for denoting one mathematical object and
that object alone. Even the distinguished constants $\pi$, $e$, and $i$ do not enjoy such an exclusivity! The symbol
$\wp$ has its own code point in the Unicode string encoding system (U+2118) and its own escape string in the
HTML standard (&weierp;). It seems rather generous of the developers of these computing standards to
go to such lengths to please the fairly small group of mathematicians who use elliptic functions in their
work.

You may wonder how this quirky state of affairs came to be. It appears to have been little more
than a historical accident. Both the function $\wp(z)$ and the notation for it were introduced by Weierstrass,
who for this purpose used a stylized handwritten lowercase p bearing some resemblance to the Sütterlin
alphabet used in handwritten German during that period in large parts of Prussia. Later authors ended
up adopting not only Weierstrass’s choice of the letter but also his particular stylization of it, and thus a
new symbol was born. For more details, refer to the online discussion [W18].


Figure 4.3: Weierstrass’ legacy in mathematical typography.


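Lemma 4.9. Let $L \subset \mathbb{C}$ be a lattice, and let $\beta > 0$. The infinite sum

\[ \sum_{\substack{\omega \in L \\ \omega \neq 0}} \frac{1}{|\omega|^{\beta}} \tag{4.9} \]

converges if and only if $\beta > 2$.


Proof. Exercise 4.2.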


Theorem 4.10 (The Weierstrass $\wp$-function). Fix a lattice $L \subset \mathbb{C}$. There exists a unique
meromorphic function, called the Weierstrass $\wp$-function and denoted $\wp(z)$, with the following
properties:

1. $\wp(z)$ is a doubly periodic function of order 2 with period lattice $L$.

2. $\wp(z)$ has a pole of order 2 at every period $\omega \in L$, with Laurent expansion around the
pole beginning with

\[ \wp(z) = \frac{1}{(z - \omega)^2} + O(z - \omega) \qquad (z \to \omega), \tag{4.10} \]

and no other poles.

3. $\wp(z)$ is an even function.


Moreover, the uniqueness already holds for a function satisfying the first two properties
without assuming the even symmetry of $\wp(z)$.

Proof. Proof of uniqueness: if $\wp^{(1)}(z)$ and $\wp^{(2)}(z)$ are two meromorphic functions satisfying
properties 1-2, then the function $f(z) = \wp^{(1)}(z) - \wp^{(2)}(z)$ is doubly periodic and has
no poles. By Proposition 4.2 it must be a constant. However, its Laurent expansion around
$z = 0$ has the constant term 0 by (4.10), so in fact $f(z) \equiv 0$ and $\wp^{(1)}(z) = \wp^{(2)}(z)$.

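Proof of existence: we define $\wp(z)$ as

\[ \wp(z) = \frac{1}{z^2} + \sum_{\substack{\omega \in L \\ \omega \neq 0}} \left( \frac{1}{(z - \omega)^2} - \frac{1}{\omega^2} \right). \tag{4.11} \]

This is a doubly infinite sum that can be written more explicitly in terms of a fundamental
pair of periods $\omega_1, \omega_2$ as

\[ \wp(z) = \frac{1}{z^2} + \sum_{\substack{(m,n) \in \mathbb{Z}^2 \\ (m,n) \neq (0,0)}} \left( \frac{1}{(z - m\omega_1 - n\omega_2)^2} - \frac{1}{(m\omega_1 + n\omega_2)^2} \right). \]

We claim that for any compact $K \subset \mathbb{C}$, the series obtained from (4.11) by removing (if
necessary) finitely many terms that have poles in $K$ converges absolutely and uniformly on
$K$. This would show that (4.11) defines a meromorphic function on $\mathbb{C}$ with poles only at
the points of $L$ where individual summands of the series have poles. To prove the claim,
fix a compact $K \subset \mathbb{C}$. For $z \in K$ and $\omega \in L \setminus K$, making the further assumption that
$|\omega| > 2|z|$ (which applies to all but finitely many terms in the series), we have

\[
\begin{aligned}
\left| \frac{1}{(z - \omega)^2} - \frac{1}{\omega^2} \right|
&= \left| \frac{\omega^2 - (z - \omega)^2}{\omega^2 (z - \omega)^2} \right|
 = \frac{|2z\omega - z^2|}{|\omega|^2\, |z - \omega|^2} \\
&\le \frac{2|z|}{|\omega| (|\omega| - |z|)^2} + \frac{|z|^2}{|\omega|^2 (|\omega| - |z|)^2}
 \le \frac{C}{|\omega|^3},
\end{aligned}
\]

where $C > 0$ is a constant that depends only on $K$. The absolute convergence of the series
now follows from Lemma 4.9.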
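Next, observe that $\wp(z)$ is trivially even, since $\omega \in L$ if and only if $\omega' = -\omega \in L$, so

\[
\begin{aligned}
\wp(-z) &= \frac{1}{(-z)^2} + \sum_{\substack{\omega \in L \\ \omega \neq 0}} \left( \frac{1}{(-z - \omega)^2} - \frac{1}{\omega^2} \right) \\
&= \frac{1}{z^2} + \sum_{\substack{\omega' \in L \\ \omega' \neq 0}} \left( \frac{1}{(-z + \omega')^2} - \frac{1}{(\omega')^2} \right) \\
&= \frac{1}{z^2} + \sum_{\substack{\omega' \in L \\ \omega' \neq 0}} \left( \frac{1}{(z - \omega')^2} - \frac{1}{(\omega')^2} \right) = \wp(z).
\end{aligned}
\]

Next, to prove that $\wp(z)$ is doubly periodic, differentiate (4.11) termwise to get

\[ \wp'(z) = -\frac{2}{z^3} - 2 \sum_{\substack{\omega \in L \\ \omega \neq 0}} \frac{1}{(z - \omega)^3} = -2 \sum_{\omega \in L} \frac{1}{(z - \omega)^3}. \tag{4.12} \]

This infinite series is manifestly doubly periodic, as it is a true symmetrization with
respect to the orbit of the $L$-action as discussed at the beginning of this section. (In fact,
the expression $\sum_{\omega \in L} (z - \omega)^{-3}$ is probably the simplest possible formula we can write
that defines a nontrivial doubly periodic function, except that the resulting function is
of order 3 and thus not the “simplest” in the sense of having the smallest order possible.)
Now let $\omega \in L$, and denote $g_\omega(z) := \wp(z + \omega) - \wp(z)$. Since

\[ g_\omega'(z) = \wp'(z + \omega) - \wp'(z) = 0, \]

that is, the derivative of $g_\omega$ is identically 0, we get that $g_\omega(z)$ is a constant. Taking $z =
-\omega/2$ (with $\omega$ one of the generators $\omega_1, \omega_2$, so that $\omega/2$ is not a pole; this suffices, since
$\omega_1, \omega_2$ generate $L$) gives $g_\omega(z) = \wp(\omega/2) - \wp(-\omega/2) = 0$ since $\wp(z)$ is even. Thus
$g_\omega(z) \equiv 0$ and $\wp(z + \omega) = \wp(z)$ for all $z$, which shows that $\wp(z)$ is doubly periodic.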

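Finally, note that $\wp(z)$ has a pole of order 2 at $z = 0$ with principal part $\frac{1}{z^2}$. After
subtracting that principal part, we are left with

\[ \wp(z) - \frac{1}{z^2} = \sum_{\substack{\omega \in L \\ \omega \neq 0}} \left( \frac{1}{(z - \omega)^2} - \frac{1}{\omega^2} \right), \]

which is holomorphic in the neighborhood of 0, with the constant term in its Taylor
expansion obtained by setting $z = 0$ in this expression, which gives

\[ \sum_{\substack{\omega \in L \\ \omega \neq 0}} \left( \frac{1}{\omega^2} - \frac{1}{\omega^2} \right) = 0. \]

This proves the Laurent expansion (4.10) for the pole at $z = 0$, and the expansion around
a general period $\omega \in L$ follows by periodicity.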
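
The construction can be explored numerically by truncating the series (4.11). The following is a rough sketch, assuming a truncation to the index square $|m|, |n| \le N$; the function name `wp`, the cutoff `N`, and the sample lattice are assumptions made for illustration. A symmetric truncation is still exactly even, but it is only approximately periodic, with an error that shrinks as `N` grows.

```python
def wp(z, w1, w2, N=60):
    """Truncated version of the defining series of the Weierstrass function:
    sum over lattice points m*w1 + n*w2 with |m|, |n| <= N, (m, n) != (0, 0)."""
    total = 1 / z**2
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m == 0 and n == 0:
                continue
            w = m * w1 + n * w2
            total += 1 / (z - w)**2 - 1 / w**2
    return total

w1, w2 = 1.0, 0.3 + 1.1j          # sample period pair (an arbitrary choice)
z = 0.31 + 0.17j
print(abs(wp(-z, w1, w2) - wp(z, w1, w2)))        # evenness defect (roundoff only)
print(abs(wp(z + w1, w1, w2) - wp(z, w1, w2)))    # periodicity defect (truncation)
```

The subtraction of $1/\omega^2$ in each term is what makes the truncated sums converge; without it the partial sums would depend strongly on the shape of the truncation region.
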


Note that the construction of the function $\wp(z)$ depends on the choice of lattice $L$. For
the time being, we regard the lattice as fixed, but later on, we will start caring more about
this dependence, and it will be helpful to have a notation that emphasizes it. To that end,
two common ways to denote the function $\wp(z)$ associated with a specific lattice $L$ are as
$\wp_L(z)$ or as $\wp(z; L)$. At some point in the discussion, we will also replace $L$ with a complex
variable $\tau$, called the modular variable, which parameterizes the space of lattices in a
convenient way (see Section 4.14). In that context the notation $\wp(z; \tau)$ is used to denote
the Weierstrass $\wp$-function including its dependence on both complex variables $z$ and $\tau$.


Suggested exercises for Section 4.4. 4.2.


4.5 Eisenstein series and the Laurent expansion of $\wp(z)$


Let $L \subset \mathbb{C}$ be a lattice. Define the quantities $G_n$, $n \ge 3$, associated with $L$ by

\[ G_n = \sum_{\omega \in L \setminus \{0\}} \frac{1}{\omega^n}. \tag{4.13} \]

The $G_n$ are known as the Eisenstein series. As with the remark above about $\wp(z)$, the
value of $G_n$ depends on the lattice $L$, and when we want to emphasize that, the notation
$G_n(L)$ can be used, or $G_n(\tau)$ once we switch to the point of view involving the modular
variable $\tau$. Note that $G_{2k-1} = 0$ for $k \ge 2$, since each term associated with $\omega \in L$
cancels out the term associated with $-\omega$. Thus the interesting Eisenstein series are
the even-indexed ones $G_4, G_6, G_8, \ldots$. As the next result shows, these series are closely
related to the Weierstrass $\wp$-function.


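Theorem 4.11. The Laurent expansion of $\wp(z)$ around $z = 0$ is given by

\[ \wp(z) = \frac{1}{z^2} + \sum_{n=1}^{\infty} (2n + 1)\, G_{2n+2}\, z^{2n} = \frac{1}{z^2} + 3G_4 z^2 + 5G_6 z^4 + 7G_8 z^6 + \cdots. \tag{4.14} \]

Proof. Keeping in mind the standard Taylor expansion

\[ \frac{1}{(1 - x)^2} = 1 + 2x + 3x^2 + 4x^3 + \cdots, \]

we write

\[
\begin{aligned}
\wp(z) &= \frac{1}{z^2} + \sum_{\omega \neq 0} \left( \frac{1}{(z - \omega)^2} - \frac{1}{\omega^2} \right)
        = \frac{1}{z^2} + \sum_{\omega \neq 0} \frac{1}{\omega^2} \left( \frac{1}{(1 - z/\omega)^2} - 1 \right) \\
&= \frac{1}{z^2} + \sum_{\omega \neq 0} \frac{1}{\omega^2} \left( 2\Big(\frac{z}{\omega}\Big) + 3\Big(\frac{z}{\omega}\Big)^2 + 4\Big(\frac{z}{\omega}\Big)^3 + 5\Big(\frac{z}{\omega}\Big)^4 + \cdots \right) \\
&= \frac{1}{z^2} + 2z \sum_{\omega \in L \setminus \{0\}} \frac{1}{\omega^3} + 3z^2 \sum_{\omega \in L \setminus \{0\}} \frac{1}{\omega^4} + 4z^3 \sum_{\omega \in L \setminus \{0\}} \frac{1}{\omega^5} + \cdots \\
&= \frac{1}{z^2} + 2G_3 z + 3G_4 z^2 + 4G_5 z^3 + 5G_6 z^4 + \cdots \\
&= \frac{1}{z^2} + 3G_4 z^2 + 5G_6 z^4 + 7G_8 z^6 + \cdots,
\end{aligned}
\]

as claimed. Note that this calculation technically involved a rearrangement of terms in
a double summation (the summation over $\omega \in L$ and the summation over the powers of
$z/\omega$ in each of the hypergeometric series $1/(1 - z/\omega)^2$ being expanded), which needs to
be justified. This is easy to do and addressed in Exercise 4.3.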


Suggested exercises for Section 4.5. 4.3.


4.6 The differential equation satisfied by $\wp(z)$


The first two Eisenstein series $G_4$ and $G_6$ play a special role in the theory of the Weierstrass
$\wp$-function and of elliptic curves. It is traditional to define rescaled versions of
them, labeled $g_2$ and $g_3$, by

\[ g_2 = 60 G_4, \qquad g_3 = 140 G_6. \tag{4.15} \]

The quantities $g_2$ and $g_3$ are known as the elliptic invariants. The role they play is
hinted at by the following result (compare to (4.2)).


Theorem 4.12. The function $\wp(z)$ satisfies the nonlinear differential equation

\[ \wp'(z)^2 = 4\wp(z)^3 - g_2 \wp(z) - g_3. \tag{4.16} \]


Proof. The idea is to consider the behavior of each term in (4.16) near $z = 0$. Using (4.14),
we have

\[
\begin{aligned}
\wp(z) &= \frac{1}{z^2} + 3G_4 z^2 + 5G_6 z^4 + O(z^6), \\
\wp'(z) &= -\frac{2}{z^3} + 6G_4 z + 20G_6 z^3 + O(z^5), \\
\wp'(z)^2 &= \frac{4}{z^6} - \frac{24 G_4}{z^2} - 80 G_6 + O(z^2), \\
\wp(z)^3 &= \frac{1}{z^6} + \frac{9 G_4}{z^2} + 15 G_6 + O(z^2).
\end{aligned}
\]

We see that by taking an appropriate combination of $\wp'(z)^2$, $\wp(z)$, and $\wp(z)^3$ we can cancel
the pole at $z = 0$ (and hence all the poles throughout the complex plane, since all of the
functions involved are doubly periodic with poles only at periods). Specifically, we have
the Taylor expansion

\[ \wp'(z)^2 - 4\wp(z)^3 + 60 G_4 \wp(z) = -140 G_6 + O(z^2) \tag{4.17} \]

around $z = 0$. This is a doubly periodic function without poles and therefore a constant
by Proposition 4.2. The value of the constant must be equal to the constant coefficient
on the right-hand side of (4.17), namely $-140 G_6 = -g_3$. Thus the relation $\wp'(z)^2 - 4\wp(z)^3 +
g_2 \wp(z) = -g_3$ holds as an identity of meromorphic functions, proving (4.16).
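
The differential equation lends itself to a numerical spot check using truncated lattice sums for $\wp$, $\wp'$, $G_4$, and $G_6$. The following is a rough sketch under the assumption of a symmetric truncation $|m|, |n| \le N$; all function names, the cutoff, and the sample lattice are hypothetical choices for illustration, and the residual is small but nonzero because of the truncation.

```python
def lattice_points(w1, w2, N):
    """Nonzero lattice points m*w1 + n*w2 with |m|, |n| <= N."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1) for n in range(-N, N + 1)
            if (m, n) != (0, 0)]

def eisenstein(k, pts):
    """Truncated Eisenstein series: sum of w^(-k) over the given points."""
    return sum(w**(-k) for w in pts)

def wp(z, pts):
    """Truncated series for the Weierstrass function."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in pts)

def wp_prime(z, pts):
    """Truncated series for the derivative, obtained by termwise differentiation."""
    return -2 / z**3 - 2 * sum(1 / (z - w)**3 for w in pts)

def ode_residual(z, w1, w2, N=60):
    """Relative size of wp'^2 - (4 wp^3 - g2 wp - g3) at the point z."""
    pts = lattice_points(w1, w2, N)
    g2 = 60 * eisenstein(4, pts)
    g3 = 140 * eisenstein(6, pts)
    p, dp = wp(z, pts), wp_prime(z, pts)
    lhs = dp**2
    rhs = 4 * p**3 - g2 * p - g3
    return abs(lhs - rhs) / abs(lhs)

print(ode_residual(0.2 + 0.1j, 1.0, 0.3 + 1.1j))
```

Increasing `N` should drive the residual toward zero, which is a useful way to convince oneself that the normalizations $g_2 = 60G_4$ and $g_3 = 140G_6$ are the right ones.
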


Corollary 4.13. The function $\wp(z)$ also satisfies the second-order differential equation

\[ \wp''(z) = 6\wp(z)^2 - \frac{1}{2} g_2. \tag{4.18} \]


Proof. This follows immediately from (4.16) by differentiating both sides and dividing
by $2\wp'(z)$.


4.7 A recurrence relation for the Eisenstein series

Starting from the differential equations (4.16) or (4.18) and comparing Taylor coefficients
on both sides, we get interesting identities relating the different Eisenstein series. For
example, the coefficient of $z^2$ on the left-hand side of (4.16) is

\[ -2 \cdot 2 \cdot 42\, G_8 + 36 G_4^2 = -168 G_8 + 36 G_4^2, \]

whereas the coefficient of $z^2$ of the expression on the right-hand side of that equation is

\[ 4 \cdot 3 \cdot 7\, G_8 + 4 \cdot 3 \cdot 3 \cdot 3\, G_4^2 - 60 \cdot 3\, G_4^2 = 84 G_8 - 72 G_4^2. \]

Equating the two and simplifying give the identity

\[ G_8 = \frac{3}{7} G_4^2. \tag{4.19} \]

Similarly, inspecting the coefficients of $z^4$ and $z^6$ on both sides of (4.16) gives two additional
identities of this type, namely

\[ G_{10} = \frac{5}{11} G_4 G_6, \tag{4.20} \]

\[ G_{12} = \frac{18}{143} G_4^3 + \frac{25}{143} G_6^2. \tag{4.21} \]

The above idea can be exploited systematically by extracting the coefficient for any
power $z^{2n}$. In the general case, this results in a recurrence relation for the Eisenstein
series.


Proposition 4.14. The Eisenstein series can be computed recursively starting with the two
initial values $G_4$, $G_6$. Specifically, for any $k \ge 4$, we have the recurrence relation

\[ G_{2k} = \frac{3}{(2k + 1)(2k - 1)(k - 3)} \sum_{j=2}^{k-2} (2j - 1)(2k - 2j - 1)\, G_{2j}\, G_{2(k-j)}. \tag{4.22} \]


Proof. Expand both sides of (4.18) as a Laurent series in $z$ using (4.14). For the left-hand
side, we have

\[ \wp''(z) = \frac{6}{z^4} + 6G_4 + \sum_{n=1}^{\infty} (2n + 1)(2n + 2)(2n + 3)\, G_{2n+4}\, z^{2n}. \]

For the right-hand side,

\[
\begin{aligned}
6\wp(z)^2 - \frac{1}{2} g_2 &= 6 \left( \frac{1}{z^2} + \sum_{k=1}^{\infty} (2k + 1)\, G_{2k+2}\, z^{2k} \right)^2 - 30 G_4 \\
&= \frac{6}{z^4} + 6 \sum_{n=1}^{\infty} \left( 2(2n + 3)\, G_{2n+4} + \sum_{j=1}^{n-1} (2j + 1)(2(n - j) + 1)\, G_{2j+2}\, G_{2(n-j)+2} \right) z^{2n} + 36 G_4 - 30 G_4.
\end{aligned}
\]

Equating the coefficients of $z^{2n}$ in these expressions (and setting $k = n + 2$) gives (4.22).
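
The recurrence can be run symbolically to express the higher Eisenstein series as polynomials in $G_4$ and $G_6$. The following is a small sketch in exact rational arithmetic, assuming the recurrence in the form $(2k+1)(2k-1)(k-3)\,G_{2k} = 3\sum_{j=2}^{k-2}(2j-1)(2k-2j-1)\,G_{2j}G_{2k-2j}$; the representation of a polynomial as a dictionary mapping exponent pairs `(a, b)` to the coefficient of $G_4^a G_6^b$, and the names `poly_mul` and `eisenstein_polys`, are choices made only for this illustration.

```python
from collections import defaultdict
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials in (G4, G6) stored as {(a, b): coefficient}."""
    out = defaultdict(Fraction)
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            out[(a1 + a2, b1 + b2)] += c1 * c2
    return dict(out)

def eisenstein_polys(kmax):
    """Return a dict {2k: polynomial} containing G4, G6 and every G_{2k}
    with 4 <= k <= kmax, computed from the recurrence."""
    G = {4: {(1, 0): Fraction(1)}, 6: {(0, 1): Fraction(1)}}
    for k in range(4, kmax + 1):
        acc = defaultdict(Fraction)
        for j in range(2, k - 1):                      # j = 2, ..., k - 2
            weight = (2 * j - 1) * (2 * k - 2 * j - 1)
            for mono, c in poly_mul(G[2 * j], G[2 * k - 2 * j]).items():
                acc[mono] += weight * c
        scale = Fraction(3, (2 * k + 1) * (2 * k - 1) * (k - 3))
        G[2 * k] = {mono: scale * c for mono, c in acc.items()}
    return G

G = eisenstein_polys(6)
for index in (8, 10, 12):
    print(index, G[index])
```

Running the loop further produces the polynomials for $G_{14}, G_{16}, \ldots$ with no extra work, which is the practical content of the recurrence.
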


An alternative method of proving (4.22) that does not rely on doubly periodic func- 
tions is explored in Exercise 4.6; see also Exercise 4.7 for further applications of this 
method. 


Corollary 4.15. All the Eisenstein series $G_{2k}$, $k \ge 2$, can be expressed as polynomials in $G_4$
and $G_6$ with rational coefficients (that do not depend on the lattice $L$ they are associated
with).


Suggested exercises for Section 4.7. 4.4, 4.5, 4.6, 4.7. 


4.8 Half-periods; factorization of the associated cubic 


Let $\omega_1, \omega_2$ be a fundamental period pair for our fixed lattice $L$. Denote by $v_1, v_2, v_3$ the
numbers

\[ v_1 = \frac{1}{2}\omega_1, \qquad v_2 = \frac{1}{2}\omega_2, \qquad v_3 = \frac{1}{2}(\omega_1 + \omega_2), \tag{4.23} \]

which we refer to as the half-periods associated with the fundamental period pair
$\omega_1, \omega_2$.


Lemma 4.16. The function $\wp'(z)$ is a doubly periodic function of order 3. Its zeros in any
fundamental parallelogram $P_{z_0}(\omega_1, \omega_2)$ that is generic for $\wp'(z)$ are the unique three points
in the parallelogram that are congruent modulo the lattice $L$ to the half-periods $v_1, v_2, v_3$,
respectively, and they are all simple zeros.²


Proof. We know that $\wp'(z)$ is of order 3 since its poles are the periods, and each one is
of order 3 (the principal part is $-2/(z - \omega)^3$; see (4.12)). Thus there are precisely three
zeros counting multiplicities in a generic fundamental parallelogram, and if we identify
three distinct zeros in such a parallelogram, then they are necessarily all simple. Now
recall that $\wp(z)$ is an even function, so $\wp'(z)$ is odd. We also know that the values $\wp'(v_j)$
of $\wp'(z)$ at the half-periods are finite numbers (that is, each $v_j$ is not a pole of $\wp'(z)$), since
$\wp'(z)$ only has poles at periods. Combining these observations we see that for any of the
half-periods $v_j$,

\[ \wp'(v_j) = -\wp'(-v_j) = -\wp'(-v_j + 2v_j) = -\wp'(v_j). \tag{4.24} \]

Thus $\wp'(v_j) = 0$ for $j = 1, 2, 3$, and in any generic fundamental parallelogram $P_{z_0}(\omega_1, \omega_2)$,
the three zeros of $\wp'(z)$ will be those three points that are congruent to $v_1, v_2, v_3$ modulo $L$.


² We say that two complex numbers $a$ and $b$ are congruent modulo $L$ if $a - b \in L$.


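The values of $\wp(z)$ at the half-periods are also important. We denote them by
$e_1, e_2, e_3$, that is,

\[ e_1 = \wp\Big(\frac{1}{2}\omega_1\Big), \qquad e_2 = \wp\Big(\frac{1}{2}\omega_2\Big), \qquad e_3 = \wp\Big(\frac{1}{2}(\omega_1 + \omega_2)\Big). \tag{4.25} \]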


Lemma 4.17. The numbers $e_1, e_2, e_3$ are distinct and are the three roots of the cubic polynomial
$4x^3 - g_2 x - g_3$ (where $g_2$ and $g_3$ are the elliptic invariants defined in (4.15)), that is,
we have the factorization

\[ 4x^3 - g_2 x - g_3 = 4(x - e_1)(x - e_2)(x - e_3). \]

Proof. If we denote $h(x) = 4x^3 - g_2 x - g_3$, then, by (4.16),

\[ h(e_j) = h(\wp(v_j)) = 4\wp(v_j)^3 - g_2 \wp(v_j) - g_3 = \wp'(v_j)^2 = 0. \]

Thus $e_1, e_2, e_3$ are zeros of $h(x)$. It remains to show that they are distinct. Assume by
contradiction that $e_j = e_k$ for some $1 \le j < k \le 3$. This would mean that the function
$\wp(z) - e_j$ has a zero of order at least 2 at $z = v_j$ (since $\wp'(v_j) = 0$ by Lemma 4.16) and also
a zero of order at least 2 at $z = v_k$. So in total $\wp(z) - e_j$ would
have at least 4 zeros, counting multiplicities, in the fundamental parallelogram $P_0(\omega_1, \omega_2)$. This contradicts the
fact that $\wp(z)$ is of order 2, and the proof is finished.


The definitions of $e_1$, $e_2$, and $e_3$ make it seem like they are dependent on the choice
of a fundamental pair $\omega_1, \omega_2$. In fact, when regarded together, they depend only on the
lattice itself, as the next result shows.


Corollary 4.18. The numbers $e_1, e_2, e_3$, considered as an unordered triple of numbers, are
independent of the choice of fundamental period pair $\omega_1, \omega_2$. That is, if $\omega_1', \omega_2'$ is another
fundamental period pair for $L$ and $e_1', e_2', e_3'$ are the numbers associated with it analogously
to $e_1, e_2, e_3$, then

\[ \{ e_1', e_2', e_3' \} = \{ e_1, e_2, e_3 \}. \]


Proof. The $e_j$ are the roots of the cubic polynomial $4x^3 - g_2 x - g_3$, whose coefficients do
not depend on the choice of fundamental pair.


4.9 $\wp(z)$ and $\wp'(z)$ generate all doubly periodic functions


We say that a function $f$ is $L$-periodic or periodic with respect to $L$ if every $\omega \in L$ is a period
of $f$. Our general discussion of doubly periodic functions earlier in the chapter motivated
and complemented our explicit construction of the Weierstrass $\wp$-function, but
it seems desirable to give an explicit way to generate all doubly periodic functions with
respect to a fixed lattice $L$. The next two theorems give an elegant solution to this classification
problem, which highlights the central role played by the Weierstrass $\wp$-function
in the theory of doubly periodic functions.


Theorem 4.19. Let $L \subset \mathbb{C}$ be a lattice. The set of even meromorphic functions
that are periodic with respect to $L$ coincides with the set of functions of the form

\[ f(z) = R(\wp(z)), \tag{4.26} \]

where $R(w)$ is a rational function.


Proof. If $f(z)$ is of the form (4.26), then clearly $f(z)$ is even, meromorphic, and
$L$-periodic. Conversely, let $f(z)$ be even, meromorphic, and $L$-periodic. Assume that $f(z)$
is nonconstant, since otherwise there is nothing to prove. Fix a fundamental parallelogram $P$
for $L$ (a translate of the parallelogram spanned by $\omega_1, \omega_2$) that is generic for
$f$; as an extra precaution, choose this $P$ in such a way that it does not contain any points
of $L$ on its boundary (it is easy to see that this is possible). Now define the even doubly
periodic function

\[ g(z) = \frac{\prod_{j=1}^{n} \bigl(\wp(z) - \wp(a_j)\bigr)}{\prod_{k=1}^{m} \bigl(\wp(z) - \wp(b_k)\bigr)}, \tag{4.27} \]

where $a_1, \ldots, a_n, b_1, \ldots, b_m$ are some points in $P$ that will be specified shortly.
The plan for the proof is as follows: we will find values for these points for which $g(z)$
defined by (4.27) has the same zeros and the same poles in $\mathbb{C} \setminus L$ as $f(z)$
(counting with multiplicities). We will then show that this property implies that
$f(z) = c\,g(z)$ for some constant $c$. Thus $f(z)$ would be of the form (4.26), and the claim
would be proved.

To show that points $a_1, \ldots, a_n, b_1, \ldots, b_m$ with the desired properties exist,
consider the zeros first. The key property we need is the following claim: if the list of zeros
of $f(z)$ in $P$ that are not elements of $L$, counting with multiplicities, consists of points
$c_1, \ldots, c_\nu$, then $\nu = 2n$ is an even number, and we can order the points in pairs
$c_{2j-1}, c_{2j}$ so that for each $1 \le j \le n$, $c_{2j-1}$ is congruent to $-c_{2j}$ modulo
$L$ (that is, $c_{2j-1} + c_{2j} \in L$). To prove this, let $\alpha$ be any zero of $f(z)$ that
is not in $L$, and let $\mu$ denote its order. We consider two cases: first, if $\alpha$ is not a
half-period, that is, $\alpha$ is not congruent to $-\alpha$ modulo $L$, then since $f(z)$ is
even, $-\alpha$ is also a zero of $f(z)$ (and of the same order as $\alpha$), so the list of zeros
that are not in $L$ has a number $\beta \in P$ that is congruent to $-\alpha$ modulo $L$, is
distinct from $\alpha$, and appears in the list of zeros the same number $\mu$ of times as
$\alpha$ does. Thus we can pair up the $\mu$ appearances of $\alpha$ with the $\mu$ appearances
of $\beta$ as required.

Next, consider the case where $\alpha$ is a half-period. In that case, we claim that the
multiplicity $\mu$ of $\alpha$ as a zero of $f(z)$ is an even number, so the required pairing
would simply be $\mu/2$ pairs of $\alpha, \alpha$. The justification for this claim is that
$\alpha$ being a zero of $f(z)$ of order $\mu$ means that

\[ f(\alpha) = f'(\alpha) = f''(\alpha) = \cdots = f^{(\mu-1)}(\alpha) = 0, \qquad f^{(\mu)}(\alpha) \ne 0. \]

However, we also know that $f$ is even, and therefore any derivative $f^{(2j)}(z)$ of $f$ of even
order is also even, and any derivative $f^{(2j-1)}(z)$ of $f$ of odd order is an odd function.
Then by a calculation similar to (4.24), taking into account that $2\alpha \in L$, we get that

\[ f^{(2j-1)}(\alpha) = -f^{(2j-1)}(-\alpha) = -f^{(2j-1)}(-\alpha + 2\alpha) = -f^{(2j-1)}(\alpha), \]

whence $f^{(2j-1)}(\alpha) = 0$ for $j \ge 1$. Since $f^{(\mu)}(\alpha) \ne 0$, $\mu$ must be even.

Having shown that the zeros $c_1, \ldots, c_{2n}$ can be matched in the way claimed above,
we now define the numbers $a_1, \ldots, a_n$ by $a_j = c_{2j}$, $1 \le j \le n$, that is, we
include in the list $a_1, \ldots, a_n$ a single representative from each pair
$c_{2j-1}, c_{2j}$. The numbers $b_1, \ldots, b_m$ are now defined by repeating the same
construction as with the zeros but for the function $1/f$ instead of $f$.

We have now defined the numbers $a_1, \ldots, a_n$ and $b_1, \ldots, b_m$. They were all chosen
as elements of $P \setminus L$, so that $\wp(a_j)$ and $\wp(b_k)$ are all finite complex numbers;
thus the right-hand side of (4.27) is a well-defined expression.

We now claim that $g(z)$ has the same zeros and poles as $f(z)$ in $P \setminus L$, counting
multiplicities. Let $\alpha \in P \setminus L$ be a zero of $f(z)$ of order $\mu$. Denote by
$\beta$ the unique point in $P$ for which $\beta$ is congruent to $-\alpha$ modulo $L$. Again,
we consider separately the cases where $\alpha$ is a half-period and where it is not. If
$\alpha$ is not a half-period, then by our construction the list of numbers $a_1, \ldots, a_n$
includes $\mu$ numbers $\gamma$ that are equal to either $\alpha$ or $\beta$. Each of them
corresponds to a factor in the numerator of $g(z)$ of the form
$\wp(z) - \wp(\gamma) = \wp(z) - \wp(\alpha) = \wp(z) - \wp(\beta)$, which is a function that has
simple zeros at $\alpha$ and at $\beta$ and no other zeros or poles in $P \setminus L$. None of
the other factors in the products that make up the numerator and denominator of $g(z)$ have a
zero or pole at $\alpha$. Thus the order of the zero of $g(z)$ at $\alpha$ is $\mu$.

In the case where $\alpha$ is a half-period, we have $\alpha = \beta$. The function
$h_\alpha(z) = \wp(z) - \wp(\alpha)$ has a double zero at $\alpha$ (the point $z = \alpha$ is a
zero of $h_\alpha$ of order at least 2, since both $h_\alpha$ and its derivative vanish there,
but $h_\alpha$ is a doubly periodic function of order 2, so the order of the zero is exactly 2)
and no other zeros or poles in $P \setminus L$. This function was included $\mu/2$ times in the
product in the numerator of $g(z)$, and again, none of the other factors in the products in the
numerator and denominator of $g(z)$ has a zero or pole at $\alpha$. So in this case, we also
have shown that the order of the zero of $g(z)$ at $\alpha$ is $\mu$.

We showed that the zeros of $g(z)$ in $P \setminus L$ match the zeros of $f(z)$ in
$P \setminus L$, with the same multiplicities. Applying the same reasoning to the poles (that
is, comparing the zeros of $1/f(z)$ with those of $1/g(z)$) shows that the poles of $g(z)$ in
$P \setminus L$ match the poles of $f(z)$ in $P \setminus L$ and their multiplicities. The
conclusion is that the function $f(z)/g(z)$ is a meromorphic $L$-periodic function all of whose
zeros and poles are elements of $L$. However, such a function must be constant: for otherwise,
if it had a zero of any order at $z = 0$, then by periodicity it would have a zero at every
$\omega \in L$ and therefore no poles, and similarly, if it had a pole at $z = 0$, then it would
have a pole at every $\omega \in L$ and therefore no zeros. Since we know that any nonconstant
doubly periodic function must have both zeros and poles, neither of those situations can occur.

To summarize, we proved that $f(z)$ coincides with the function $c\,g(z)$ for some constant
$c$, as claimed. The proof is complete.
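As a concrete instance of Theorem 4.19, the even $L$-periodic function $\wp(2z)$ must be a rational function of $\wp(z)$. The classical duplication formula (a standard fact, not derived in this section) gives $\wp(2z) = R(\wp(z))$ with $R(w) = \bigl((w^2 + g_2/4)^2 + 2g_3 w\bigr)/(4w^3 - g_2 w - g_3)$. The sketch below compares both sides numerically; it assumes the normalization $g_2 = 60\sum\omega^{-4}$, $g_3 = 140\sum\omega^{-6}$ and uses truncated lattice sums, and all helper names are illustrative.

```python
# Numerical illustration of Theorem 4.19 for f(z) = wp(2z), via truncated lattice sums.
# Helper names and the cutoff N are illustrative, not notation from the text.

def lattice_points(w1, w2, N):
    """Nonzero lattice points m*w1 + n*w2 with |m|, |n| <= N."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1)
            for n in range(-N, N + 1)
            if (m, n) != (0, 0)]

def wp(z, pts):
    """Truncation of 1/z^2 + sum over w of (1/(z-w)^2 - 1/w^2)."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in pts)

def invariants(pts):
    """Truncations of g2 = 60*sum(w^-4) and g3 = 140*sum(w^-6) (assumed normalization)."""
    return 60 * sum(w**-4 for w in pts), 140 * sum(w**-6 for w in pts)

def R(w, g2, g3):
    """Rational function from the duplication formula: wp(2z) = R(wp(z))."""
    return ((w**2 + g2 / 4)**2 + 2 * g3 * w) / (4 * w**3 - g2 * w - g3)

pts = lattice_points(1.0, 0.3 + 1.1j, 60)
g2, g3 = invariants(pts)
z = 0.2 + 0.1j                      # an arbitrary test point away from the lattice
lhs = wp(2 * z, pts)                # f(z) evaluated directly
rhs = R(wp(z, pts), g2, g3)         # f(z) evaluated as a rational function of wp(z)
print(abs(lhs - rhs))               # small, limited by the truncation of the sums
```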


Theorem 4.20. Let $L \subset \mathbb{C}$ be a lattice. The set of meromorphic functions that are
periodic with respect to $L$ coincides with the set of functions of the form

\[ f(z) = R(\wp(z), \wp'(z)), \tag{4.28} \]

where $R(\xi, \zeta)$ is a rational function in two variables.


Proof. If $f$ is of the form (4.28), then it is meromorphic and $L$-periodic. Conversely, given
a meromorphic and $L$-periodic function $f$, decompose $f(z)$ in the standard way as a sum
$f(z) = g(z) + h(z)$ of an even function $g(z)$ and an odd function $h(z)$, where

\[ g(z) = \frac{f(z) + f(-z)}{2}, \qquad h(z) = \frac{f(z) - f(-z)}{2}. \]

Now note that $g(z)$ is an even $L$-periodic function and therefore by Theorem 4.19 can be
represented as a rational function in $\wp(z)$. Similarly, $h(z)$ is an odd $L$-periodic
function, which means that $h(z)/\wp'(z)$ is even and $L$-periodic. Therefore $h(z)$ can be
represented as $\wp'(z)$ times a rational function in $\wp(z)$. Combining the two
representations for $g(z)$ and $h(z)$ gives the desired representation for $f(z)$.


Suggested exercises for Section 4.9. 4.8, 4.9. 


4.10 $\wp(z)$ as a conformal map for rectangles


Among the remarkable properties of the Weierstrass $\wp$-function, it provides a solution
to the natural geometric problem of conformally mapping a rectangle onto a half-plane.
This happens in the case where the associated lattice $L$ is a rectangular lattice, that is,
when it is of the form $L = \mathbb{Z} + i\lambda\mathbb{Z}$ for a real parameter $\lambda > 0$.
The precise result is as follows.


Theorem 4.21. Let $\lambda > 0$, let $L = \mathbb{Z} + i\lambda\mathbb{Z}$ be a rectangular
lattice, and let $\wp(z) = \wp(z; L)$ be the associated Weierstrass elliptic function. The map
$\wp(z)$ restricted to the rectangle $R = (0, \tfrac12) \times (0, \tfrac12\lambda)$ is a
conformal map from $R$ to the lower half-plane $\{z : \operatorname{Im}(z) < 0\}$.

Proof. Denote by $R'$ the closed rectangle $[0, \tfrac12] \times [0, \tfrac12\lambda]$. First,
note that the restriction of $\wp(z)$ to $R'$ is injective. Indeed, $a = 0$ is the unique point
in $R'$ that gets mapped to $\infty$. On the other hand, if $a \in R' \setminus \{0\}$, then the
function $\wp(z) - \wp(a)$ has simple zeros at $a$ and $1 + i\lambda - a$ and (since $\wp(z)$ is
a doubly periodic function of order 2) at no other points in the fundamental parallelogram
$P_0(1, i\lambda)$. When $a = (1 + i\lambda)/2$, those two points coincide, and for any other
$a \in R' \setminus \{0\}$, the second zero $1 + i\lambda - a$ is not in $R'$. This proves the
injectivity claim. It follows that $\wp(z)$ maps $R$ conformally to its image $\wp(R)$.

To understand why the image $\wp(R)$ is the lower half-plane, it is helpful to examine
the behavior of $\wp(z)$ as one traverses the boundary $\partial R$ of the rectangle in an
anticlockwise direction, starting at 0. Denote $e_1 = \wp(1/2)$, $e_2 = \wp(i\lambda/2)$,
$e_3 = \wp((1 + i\lambda)/2)$ as in (4.25). We claim that $\partial R$ is mapped under $\wp(z)$
to the real line (including the point at infinity, the image of 0). More specifically, the
numbers $e_1, e_2, e_3$ have the ordering $-\infty < e_2 < e_3 < e_1 < \infty$, and as $z$ moves
successively along the four boundary edges $(0, 1/2]$, $[1/2, (1 + i\lambda)/2]$,
$[(1 + i\lambda)/2, i\lambda/2]$, and $[i\lambda/2, 0]$,³ the image $\wp(z)$ descends from
$+\infty$ to $e_1$ (the image of the first boundary edge), then from $e_1$ to $e_3$ (second
boundary edge image), then from $e_3$ to $e_2$ (third boundary edge image), and finally from
$e_2$ to $-\infty$ (fourth boundary edge image).

This geometric picture is easily justified by the following list of simple claims.

1. $\wp(z)$ takes real values on the segment $(0, 1/2)$.

Proof. This is immediate from (4.11).

2. $\wp(z)$ is decreasing on $(0, 1/2]$.

Proof. The derivative $\wp'(z)$ is nonzero everywhere in $R'$ except at the three points
$1/2$, $(1 + i\lambda)/2$, and $i\lambda/2$. Thus $\wp(t)$ regarded as a function of a real
variable $t \in (0, 1/2]$ is monotone. It must be decreasing rather than increasing, since the
Laurent expansion (4.10) around $z = 0$ implies that

\[ \lim_{t \to 0^+} \wp(t) = +\infty \]

(in the sense of ordinary real limits from calculus).


3. $\wp(z)$ takes real values on the segment $[1/2, (1 + i\lambda)/2]$.

Proof. By representation (4.12) for the derivative of $\wp(z)$, we have

\[
\begin{aligned}
\wp'\left(\tfrac12 + it\right)
  &= -2 \sum_{m,n \in \mathbb{Z}} \frac{1}{\left(\tfrac12 + it - m - i\lambda n\right)^3} \\
  &= -2 \sum_{m,n \in \mathbb{Z}} \frac{1}{\left((\tfrac12 - m) + i(t - \lambda n)\right)^3} \\
  &= -2 \sum_{n \in \mathbb{Z}} \sum_{m \le 0}
     \left[ \frac{1}{\left((\tfrac12 - m) + i(t - \lambda n)\right)^3}
          + \frac{1}{\left(-(\tfrac12 - m) + i(t - \lambda n)\right)^3} \right].
\end{aligned}
\]

This represents $\wp'(\tfrac12 + it)$ as a sum of terms of the form

\[ \frac{1}{(x + iy)^3} + \frac{1}{(-x + iy)^3} = 2i\,\frac{-y(3x^2 - y^2)}{(x^2 + y^2)^3} \]

over pairs $x = 1/2 - m$ and $y = t - \lambda n$ (both real numbers if $t$ is assumed real). Thus
we see that $\wp'(z)$ takes imaginary values on the segment $[1/2, (1 + i\lambda)/2]$. Since we
already know that $\wp(1/2)$ is a real number, we get that

\[ \wp(1/2 + it) = \wp(1/2) + \int_{1/2}^{1/2 + it} \wp'(z)\,dz \]

is also real for $0 \le t \le \lambda/2$.

3 Here we use the notation [a,b] to denote the directed straight line segment connecting a point a to
another point b. Similarly, the notations (a, b), (a, b], and [a, b) are further used to denote open and half-
open straight line segments, consistently with the usual notation for intervals from real analysis.
4. $\wp(z)$ is decreasing on the segment $[1/2, (1 + i\lambda)/2]$.

Proof. Again, from the knowledge of where $\wp'(z)$ takes nonzero values we conclude
that the function $t \mapsto \wp(1/2 + it)$ is monotone for $0 \le t \le \lambda/2$. Again, it
is not only monotone but in fact must be decreasing: if it were increasing, then
$\wp(1/2 + it)$ for $0 < t \le \lambda/2$ would be a real number in $(e_1, \infty)$. That is
impossible, since as discussed above, $\wp(z)$ is injective on the closed rectangle $R'$, and
the real numbers in $(e_1, \infty)$ were already shown to belong to the image of the interval
$(0, 1/2)$.


Using similar arguments, it is not difficult to verify the following additional claims:

5. $\wp(z)$ takes real values on the segment $[(1 + i\lambda)/2, i\lambda/2]$.

6. $\wp(z)$ is decreasing on the segment $[(1 + i\lambda)/2, i\lambda/2]$.

7. $\wp(z)$ takes real values on the segment $[i\lambda/2, 0)$.

8. $\wp(z)$ is decreasing on the segment $[i\lambda/2, 0)$.


This completes the explanation about the mapping properties of $\wp(z)$ on the boundary
of $R$. Now since $\wp(z)$ maps the rectangle boundary to the real axis and is injective on
$R'$, we see that $R$ itself must get mapped either to the lower half-plane or to the upper
half-plane. Appealing again to the Laurent expansion (4.10), we see that for $z$ in $R$ that is
close to 0 (for example, $z$ of the form $\epsilon(1 + i)$ where $\epsilon > 0$ is small, for
which $\wp(z) \approx 1/z^2 = -i/(2\epsilon^2)$), $\wp(z)$ lies in the lower half-plane, so
$\wp(R)$ is the lower half-plane, as claimed.
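The picture described in the proof can be checked numerically. The sketch below is an illustration only: it approximates $\wp$ by a truncated lattice sum for the (arbitrarily chosen) value $\lambda = 1.5$, evaluates $e_1, e_2, e_3$, and samples a few interior points of $R$. The helper names, the cutoff, and the sample grid are assumptions made for the example.

```python
# Numerical illustration of Theorem 4.21 for lambda = 1.5, via a truncated lattice sum.
# Helper names, the cutoff N = 60, and the sample grid are illustrative choices.

def lattice_points(w1, w2, N):
    """Nonzero lattice points m*w1 + n*w2 with |m|, |n| <= N."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1)
            for n in range(-N, N + 1)
            if (m, n) != (0, 0)]

def wp(z, pts):
    """Truncation of 1/z^2 + sum over w of (1/(z-w)^2 - 1/w^2)."""
    return 1 / z**2 + sum(1 / (z - w)**2 - 1 / w**2 for w in pts)

lam = 1.5
pts = lattice_points(1.0, 1j * lam, 60)

# Values at the three half-periods; they should be (approximately) real.
e1 = wp(0.5, pts)
e2 = wp(0.5j * lam, pts)
e3 = wp(0.5 + 0.5j * lam, pts)

# A 3 x 3 grid of interior points of R = (0, 1/2) x (0, lam/2).
fractions = (0.25, 0.5, 0.75)
samples = [complex(0.5 * s, 0.5 * lam * t) for s in fractions for t in fractions]
images = [wp(z, pts) for z in samples]

print(e2.real, e3.real, e1.real)         # expected to be increasing
print([w.imag < 0 for w in images])      # interior points land in the lower half-plane
```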


Fig. 4.4 illustrates how the Weierstrass $\wp$-function associated with the square lattice
$\mathbb{Z}^2$ can be used to conformally map a square to the unit disc.

Figure 4.4: For the square lattice $\mathbb{Z}^2 = \mathbb{Z} + i\mathbb{Z}$, if we take $\psi$
to be any conformal map from the lower half-plane to the unit disc, then the map
$z \mapsto \psi \circ \wp(z)$ maps the square $(0, \tfrac12) \times (0, \tfrac12)$ conformally
onto the unit disc. The figure shows the action of the map for one particular choice of $\psi$.


4.11 The discriminant of a cubic polynomial 


The discriminant of a complex polynomial $p(z) = a_n z^n + \cdots + a_1 z + a_0$ of degree
$n > 1$ is defined by

\[ \Delta_p = a_n^{2n-2} \prod_{1 \le i < j \le n} (z_i - z_j)^2, \tag{4.29} \]

where $z_1, \ldots, z_n$ denote the roots of $p(z)$, counting multiplicities. Note that this
definition does not depend on the ordering of the roots. Trivially, $p(z)$ has multiple (that
is, nonsimple) zeros if and only if $\Delta_p = 0$. What in addition makes $\Delta_p$ a useful
quantity is that it is of the form $a_n^{2n-2}$ multiplied by a symmetric polynomial in the
zeros of $p(z)$, and therefore, by a standard result from algebra, it can be expressed as a
polynomial in the coefficients of $p(z)$, providing an explicit criterion for checking if a
polynomial has multiple zeros. For example, for a quadratic polynomial $p(z) = az^2 + bz + c$,
we learn in basic algebra that $\Delta_p = b^2 - 4ac$. The derivation is trivial.

If $p(z) = 4z^3 - az - b$ is a cubic polynomial given in the “reduced” form we are using
for our elliptic curves discussion, then the formula expressing the discriminant in terms
of the coefficients $a, b$ is less well known, and its derivation is a bit less trivial.


Lemma 4.22. The discriminant of the cubic $p(z) = 4z^3 - az - b$ is given by

\[ \Delta_p = 16(a^3 - 27b^2). \tag{4.30} \]

We note that in some books, the discriminant of a cubic polynomial
$4z^3 - az - b = 4(z - z_1)(z - z_2)(z - z_3)$ is defined as
$16(z_1 - z_2)^2(z_1 - z_3)^2(z_2 - z_3)^2$, which differs from our definition (4.29), the usual
definition for general degree $n$ polynomials, by a factor of $1/16$. For that alternative
scaling, the correct formula would be $\Delta_p = a^3 - 27b^2$. See also (4.35).


Proof of Lemma 4.22. Denote the zeros of $p(z)$ by $z_1, z_2, z_3$. By comparing coefficients
of powers of $z$ in the equation

\[ p(z) = 4z^3 - az - b = 4(z - z_1)(z - z_2)(z - z_3) \]

we get the relations

\[
\begin{aligned}
\mu_1 &:= z_1 + z_2 + z_3 = 0, \\
\mu_2 &:= z_1 z_2 + z_1 z_3 + z_2 z_3 = -\frac{a}{4}, \\
\mu_3 &:= z_1 z_2 z_3 = \frac{b}{4}.
\end{aligned}
\]

Next, differentiate $p(z)$ to get that

\[ p'(z) = 4(z - z_1)(z - z_2) + 4(z - z_1)(z - z_3) + 4(z - z_2)(z - z_3), \]

so in particular

\[
\begin{aligned}
p'(z_1) &= 4(z_1 - z_2)(z_1 - z_3), \\
p'(z_2) &= 4(z_2 - z_1)(z_2 - z_3), \\
p'(z_3) &= 4(z_3 - z_1)(z_3 - z_2).
\end{aligned}
\]

Therefore

\[ \Delta_p = -4\,p'(z_1)\,p'(z_2)\,p'(z_3). \]

On the other hand, $p'(z) = 12z^2 - a$, so we get that

\[
\begin{aligned}
\Delta_p &= -4(12z_1^2 - a)(12z_2^2 - a)(12z_3^2 - a) \\
&= -4\left[12^3 z_1^2 z_2^2 z_3^2 - 12^2 a\,(z_1^2 z_2^2 + z_1^2 z_3^2 + z_2^2 z_3^2)
   + 12 a^2 (z_1^2 + z_2^2 + z_3^2) - a^3\right].
\end{aligned}
\tag{4.31}
\]

In this expansion, we have that

\[ z_1^2 z_2^2 z_3^2 = \mu_3^2 = \frac{b^2}{16}, \tag{4.32} \]

\[ z_1^2 + z_2^2 + z_3^2 = (z_1 + z_2 + z_3)^2 - 2(z_1 z_2 + z_1 z_3 + z_2 z_3) = 0 - 2\mu_2 = \frac{a}{2}. \tag{4.33} \]

This also gives that

\[
\begin{aligned}
\frac{a^2}{4} = 4\mu_2^2 &= 4(z_1 z_2 + z_1 z_3 + z_2 z_3)^2 \\
&= 4(z_1^2 z_2^2 + z_1^2 z_3^2 + z_2^2 z_3^2) + 8 z_1 z_2 z_3 (z_1 + z_2 + z_3),
\end{aligned}
\]

which, since $z_1 + z_2 + z_3 = 0$, yields the relation

\[ z_1^2 z_2^2 + z_1^2 z_3^2 + z_2^2 z_3^2 = \frac{a^2}{16}. \tag{4.34} \]

Substituting (4.32), (4.33), and (4.34) into representation (4.31) for $\Delta_p$ gives finally
that

\[ \Delta_p = -4\left[12^3\,\frac{b^2}{16} - 12^2 a\,\frac{a^2}{16} + 12 a^2\,\frac{a}{2} - a^3\right] = 16(a^3 - 27b^2), \]

as claimed.
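Formula (4.30) can be confirmed on explicit examples with exact integer arithmetic. The sketch below starts from three roots summing to zero, reads off $a$ and $b$ from the relations for $\mu_2$ and $\mu_3$, and compares definition (4.29) with (4.30). The function names are illustrative and not from the text.

```python
# Exact check of Lemma 4.22 on sample roots (function names are illustrative).

def reduced_cubic_from_roots(z1, z2, z3):
    """Coefficients (a, b) with 4z^3 - a z - b = 4(z - z1)(z - z2)(z - z3).

    Requires z1 + z2 + z3 = 0 so that the z^2 term vanishes.
    """
    if z1 + z2 + z3 != 0:
        raise ValueError("the roots of a reduced cubic must sum to zero")
    a = -4 * (z1 * z2 + z1 * z3 + z2 * z3)   # from mu_2 = -a/4
    b = 4 * z1 * z2 * z3                     # from mu_3 = b/4
    return a, b

def discriminant_by_definition(z1, z2, z3):
    """Definition (4.29) with leading coefficient 4 and n = 3."""
    return 4**4 * ((z1 - z2) * (z1 - z3) * (z2 - z3))**2

def discriminant_by_formula(a, b):
    """Formula (4.30)."""
    return 16 * (a**3 - 27 * b**2)

# Distinct roots 1, 2, -3: the cubic is 4z^3 - 28z + 24, so a = 28 and b = -24.
a, b = reduced_cubic_from_roots(1, 2, -3)
print(a, b, discriminant_by_definition(1, 2, -3), discriminant_by_formula(a, b))

# A double root (1, 1, -2) should give discriminant zero.
a2, b2 = reduced_cubic_from_roots(1, 1, -2)
print(discriminant_by_formula(a2, b2))
```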


4.12 The discriminant of a lattice 


Let $L \subset \mathbb{C}$ be a lattice, and let $g_2, g_3$ be the associated elliptic
invariants defined in (4.15). The quantity

\[ \Delta = g_2^3 - 27 g_3^2 \tag{4.35} \]

is called the discriminant of the lattice $L$. In the context of the theory of modular forms,
which is the subject of the next chapter, it is called the modular discriminant. Note that,
as we see from (4.30), $\Delta$ is simply the discriminant of the cubic polynomial
$4z^3 - g_2 z - g_3$ (with the different scaling convention mentioned after the statement of
Lemma 4.22). By (4.29), (4.30), and Lemma 4.17 it can also be rewritten as

\[ \Delta = 16(e_1 - e_2)^2 (e_1 - e_3)^2 (e_2 - e_3)^2, \tag{4.36} \]

where $e_1, e_2, e_3$ are given by (4.25). Since these numbers are distinct by Lemma 4.17, we
also get the following conceptually important result.


Corollary 4.23. The discriminant $\Delta$ of a lattice $L$ is always nonzero.


4.13 The J-invariant of a lattice 


Another important parameter associated with a lattice $L$ is known as Klein's $J$-invariant.
It is defined by

\[ J = \frac{g_2^3}{\Delta} = \frac{g_2^3}{g_2^3 - 27 g_3^2}, \]

which evaluates to a complex number since $\Delta$ is never 0. Klein's $J$-invariant plays an
important role in the theory of modular functions and modular forms, and we will have
more to say about it later; see Sections 5.9-5.10.
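To get a feel for this definition, here is a small numerical sketch. It assumes the normalization $g_2 = 60\sum\omega^{-4}$, $g_3 = 140\sum\omega^{-6}$ for (4.15) and uses truncated lattice sums; the function names are illustrative. It illustrates two standard facts: for the square lattice $\mathbb{Z} + i\mathbb{Z}$ we have $g_3 = 0$ and hence $J = 1$, and $J$ does not change when the lattice is rescaled by a nonzero complex factor (because $g_2^3$ and $g_3^2$ both scale by the same factor).

```python
# Klein's J-invariant from truncated lattice sums (names and cutoff are illustrative).

def lattice_points(w1, w2, N):
    """Nonzero lattice points m*w1 + n*w2 with |m|, |n| <= N."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1)
            for n in range(-N, N + 1)
            if (m, n) != (0, 0)]

def invariants(pts):
    """Truncations of g2 = 60*sum(w^-4) and g3 = 140*sum(w^-6) (assumed normalization)."""
    return 60 * sum(w**-4 for w in pts), 140 * sum(w**-6 for w in pts)

def klein_J(w1, w2, N=40):
    """J = g2^3 / (g2^3 - 27*g3^2) for the lattice w1*Z + w2*Z."""
    g2, g3 = invariants(lattice_points(w1, w2, N))
    return g2**3 / (g2**3 - 27 * g3**2)

J_square = klein_J(1.0, 1j)                       # square lattice: g3 = 0 by symmetry

scale = 0.7 - 0.4j                                # an arbitrary nonzero scaling factor
J_a = klein_J(1.0, 0.3 + 1.1j)
J_b = klein_J(scale * 1.0, scale * (0.3 + 1.1j))  # the rescaled lattice

print(J_square, J_a, J_b)
```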


4.14 The modular variable $\tau$: from elliptic functions to elliptic
modular functions


Up until now, we considered the Weierstrass elliptic function associated with a specific
fixed lattice $L$ and denoted it by $\wp(z)$, letting the dependence on $L$ remain implicit in
the notation. However, it turns out that there is much to gain from considering the lattice
itself as another variable on which the Weierstrass $\wp$-function and other related quantities
depend. Moreover, while a priori it might seem that “functions of a lattice-valued variable”
are a cumbersome notion to attempt to study, it turns out that we can encode the dependence
on the lattice in a natural way with a single complex variable, called the modular
variable and denoted $\tau$. From this new point of view, the function $\wp(z)$ (which, as we
have also said, can sometimes be denoted $\wp(z; L)$) becomes a function of two complex
variables, now denoted $\wp(z; \tau)$. Historically, the functions that we now refer to as
elliptic functions were known as elliptic modular functions to signify this double dependence
on the variable $z$, with respect to which they are doubly periodic, and the variable $\tau$,
the dependence on which has its own interesting flavor, captured by the term “modular.”
This term seems to be mostly used in older textbooks.

To explain the connection between $L$ and $\tau$, note that our convention to represent
lattices as $L = \omega_1\mathbb{Z} + \omega_2\mathbb{Z}$ involves certain degrees of freedom
that are not interesting in the sense that they can easily be eliminated and play no further
role in the analysis. First, the ordering of $\omega_1, \omega_2$ is immaterial; that is, the
ordered pair $\omega_1, \omega_2$ represents the same lattice as $\omega_2, \omega_1$. We can get
rid of this double representation of lattices by considering the pair $\omega_1, \omega_2$ to
come ordered in such a way that the parallelogram with vertices
$0, \omega_1, \omega_1 + \omega_2, \omega_2$ is “oriented in the positive direction.”
Equivalently, this means that their quotient $\omega_2/\omega_1$ lies in the upper half-plane.

Second, lattices can also be scaled and rotated; that is, a pair $\omega_1, \omega_2$
representing the lattice $L = \omega_1\mathbb{Z} + \omega_2\mathbb{Z}$ can be replaced by
$\omega_1' = \lambda\omega_1$, $\omega_2' = \lambda\omega_2$ for some scalar $\lambda \ne 0$ to
obtain the lattice $L' = \omega_1'\mathbb{Z} + \omega_2'\mathbb{Z}$. Although $L'$ and $L$ are
technically distinct lattices, from the point of view of complex analysis, they are equivalent
in the sense that the Riemann surfaces $\mathbb{C}/L$ and $\mathbb{C}/L'$ are conformally
equivalent via the scaling map $z \mapsto \lambda z$; meromorphic functions that are
$L$-periodic are trivially in bijection with those that are $L'$-periodic; the Weierstrass
$\wp$-function associated with $L$ is in a simple relation to the $\wp$-function associated with
$L'$; etc. Formally, we say that lattices $L, L'$ related by $L' = \lambda L$ for some
$\lambda \ne 0$ are homothetic. The above remarks can be summarized as saying that our main
interest is in understanding lattices up to the equivalence relation of homothety.

For this reason, we now define the modular variable

\[ \tau = \frac{\omega_2}{\omega_1}, \]

a parameter taking values in the upper half-plane $\mathbb{H}$, which we consider to be
canonically associated with the lattice

\[ L_\tau = \mathbb{Z} + \tau\mathbb{Z}. \]

This lattice is equivalent, via a rescaling operation as described above, to the lattice

\[ L = \omega_1\mathbb{Z} + \omega_2\mathbb{Z}; \]

indeed, $L = \omega_1 L_\tau$. With this notation, the original lattice $L$ and fundamental
period pair $\omega_1, \omega_2$ need not play any further role in the analysis.
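The passage from a period pair to the modular variable is mechanical, and a short sketch may make the two normalizations (ordering and scaling) concrete. The function name below is hypothetical; it returns the quotient of the two periods, swapping them if necessary so that the result lies in the upper half-plane.

```python
# From a period pair to the modular variable tau (function name is illustrative).

def modular_variable(w1, w2):
    """Return tau in the upper half-plane for the lattice w1*Z + w2*Z.

    Uses tau = w2/w1 if that has positive imaginary part, and otherwise swaps
    the roles of the two periods, which does not change the lattice.
    """
    w1, w2 = complex(w1), complex(w2)
    tau = w2 / w1
    if tau.imag == 0:
        raise ValueError("w1 and w2 are linearly dependent over the reals")
    if tau.imag < 0:
        tau = w1 / w2          # Im(1/tau) has the opposite sign to Im(tau)
    return tau

print(modular_variable(1, 1j))          # the pair is already positively ordered
print(modular_variable(1j, 1))          # swapped pair, same lattice, same tau

# Rescaling both periods by the same nonzero factor leaves tau unchanged.
factor = 0.7 - 0.2j
print(modular_variable(factor * 2, factor * (1 + 3j)))
```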

As we will see in the next chapter, the transition to the parameterization of lattices
using the modular variable $\tau$ will reveal many additional layers of depth and beauty
to the theory and open up a new complex-analytic area to explore, that of the modular
surface and various families of meromorphic functions that are associated with it,
which are known as modular functions and modular forms.


4.15 The classification problem for complex tori 


You might have noticed by now, or seen it pointed out somewhere, that the doubly pe- 
riodic functions we have been studying can be naturally identified with functions on 
a quotient space C/L in which we consider points z, z’ as equivalent if they are con- 
gruent modulo the lattice L. This quotient space (which is indeed a quotient group) is 
topologically homeomorphic to the torus $\mathbb{T}^2 = S^1 \times S^1$, a compact surface. It
also comes naturally equipped with the structure of a Riemann surface, inherited from
$\mathbb{C}$ (in this book, we will not discuss the formal details of how this structure is set
up, but at an intuitive level, it is not hard to appreciate that quotienting by a discrete
subgroup leaves the complex structure “locally” similar to that of a normal complex region
$\Omega$), so when thought of in that way, we refer to it as a complex torus. The doubly
periodic functions
that are periodic with respect to the lattice L, which are the meromorphic functions on 
C that “respect” the equivalence relation of congruence modulo the lattice, can be seen 
from this point of view as simply meromorphic functions on the complex torus C/L. So 
the theory of doubly periodic functions is precisely the study of the complex-analytic 
structure of complex tori. 

This way of thinking takes us back to the discussion of conformal mappings from
Chapter 3 and the problem of classifying complex regions, or in the current context Riemann
surfaces, up to conformal equivalence. Each lattice $L$ gives rise to its own complex
torus, but what can be said about how to decide when one complex torus $\mathbb{C}/L$ is
biholomorphic to another complex torus $\mathbb{C}/L'$?⁴ (Note that there is no hope for a
complex torus to be biholomorphic to anything that is not topologically a torus, such as an
ordinary complex region $\Omega \subset \mathbb{C}$, since conformal equivalence is stronger
than topological homeomorphism.) Thus we arrive at the classification problem for complex
tori. This consists broadly of several related questions:

1. First, what are necessary and sufficient conditions that two lattices
$L, L' \subset \mathbb{C}$ must satisfy for the biholomorphism relation
$\mathbb{C}/L \cong \mathbb{C}/L'$ to hold?

2. Second, can we find a nicely behaved set of representatives covering all conformal
equivalence classes for the tori $\mathbb{C}/L$, with each class being covered exactly once?

3. Third, can this set of representatives be parameterized using a canonical “invariant”
of some kind to make its description even simpler? (What this means exactly will
become clearer later.)


Before you continue reading, pause for a minute to think what you might expect a solu- 
tion to this classification problem to look like, keeping in mind some of the phenomena 
we discussed in Chapter 3, such as the Riemann mapping theorem and the classification 
of annuli and doubly connected regions up to conformal equivalence. 

We will have to develop some additional theory to fully answer these questions. As 
we will see, the answers are related to the theory of modular forms, discussed in the 
next chapter. For now, we can formulate an initial attempt at a solution that answers 
the first of the questions formulated above. The remaining questions are answered in 
Sections 5.5 and 5.11. 


Theorem 4.24 (Classification of complex tori: first part). Let $L, L' \subset \mathbb{C}$ be
two lattices in the complex plane.

(a) The complex tori $\mathbb{C}/L$ and $\mathbb{C}/L'$ are biholomorphic as Riemann surfaces if
and only if the lattices $L$ and $L'$ are homothetic.

(b) If $L, L'$ are given explicitly as

\[ L = \omega_1\mathbb{Z} + \omega_2\mathbb{Z}, \qquad L' = \omega_1'\mathbb{Z} + \omega_2'\mathbb{Z}, \tag{4.37} \]

in terms of respective fundamental period pairs $(\omega_1, \omega_2)$, $(\omega_1', \omega_2')$
for the two lattices, then the homothety condition in part (a) is satisfied if and only if

\[ \frac{\omega_1'}{\omega_2'} = \frac{a\omega_1 + b\omega_2}{c\omega_1 + d\omega_2} \]

for some $a, b, c, d \in \mathbb{Z}$ such that $ad - bc = \pm 1$.


4 When talking about Riemann surfaces, it seems a bit more customary to use the term “biholomorphic”
rather than “conformally equivalent”, although the two terms are generally regarded as synonymous.

Proof. (b) It is immediate from Lemma 4.1 that $L$ and $L'$ given in (4.37) are homothetic
if and only if

\[ \omega_1' = \lambda(\alpha\omega_1 + \beta\omega_2), \qquad \omega_2' = \lambda(\gamma\omega_1 + \delta\omega_2) \]

for some complex number $\lambda \ne 0$ and integers $\alpha, \beta, \gamma, \delta \in \mathbb{Z}$
such that $\alpha\delta - \beta\gamma = \pm 1$. It is easy to see that this is equivalent to the
condition described in the theorem: dividing the first equation by the second eliminates
$\lambda$, and conversely, given the ratio condition, one can take
$\lambda = \omega_2'/(c\omega_1 + d\omega_2)$.
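The criterion in part (b) lends itself to a brute-force experiment. The sketch below is only an illustration: it searches integer matrices with small entries and determinant $\pm 1$ for one that satisfies the ratio condition, written in cross-multiplied form to avoid divisions. The function name, the search bound, and the tolerance are assumptions of the example; a negative answer from a bounded search is of course only suggestive in general, although in the second example below one can check by hand that no solution exists.

```python
# Bounded brute-force search for the relation in Theorem 4.24(b).
# The function name, search bound, and tolerance are illustrative choices.
from itertools import product

def find_unimodular_relation(pair, pair_prime, bound=3, tol=1e-9):
    """Look for integers a, b, c, d with |entries| <= bound and ad - bc = +-1 such that
    w1'/w2' = (a*w1 + b*w2) / (c*w1 + d*w2). Returns (a, b, c, d) or None."""
    w1, w2 = pair
    v1, v2 = pair_prime
    entries = range(-bound, bound + 1)
    for a, b, c, d in product(entries, repeat=4):
        if abs(a * d - b * c) != 1:
            continue
        # Cross-multiplied form of the ratio condition.
        if abs(v1 * (c * w1 + d * w2) - v2 * (a * w1 + b * w2)) < tol:
            return a, b, c, d
    return None

# A lattice homothetic to Z + iZ: change of basis (2, 1; 1, 1) followed by scaling.
factor = 0.7 - 0.2j
square_pair = (1, 1j)
scaled_pair = (factor * (2 * 1 + 1 * 1j), factor * (1 * 1 + 1 * 1j))
relation = find_unimodular_relation(square_pair, scaled_pair)
print(relation)

# Z + 2iZ is not homothetic to Z + iZ; the search finds nothing.
print(find_unimodular_relation(square_pair, (1, 2j)))
```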

(a) We start by proving the “if” part of the claim. Assume that $L' = \lambda L$ with
$\lambda \ne 0$. Define the map $f : \mathbb{C}/L \to \mathbb{C}/L'$ by

\[ f(z + L) = \lambda z + L', \]

that is, the map taking the coset $z + L$ in the quotient group $\mathbb{C}/L$ to the coset
$\lambda z + L'$ in the quotient group $\mathbb{C}/L'$. We claim that $f$ is well-defined (i.e.,
that the definition is independent of the choice of a member $z$ of the coset). Indeed, if
$z_1, z_2$ are members of the same coset of $\mathbb{C}/L$, that is, $z_1 + L = z_2 + L$, then

\[ \lambda z_1 + L' = \lambda z_1 + \lambda L = \lambda(z_1 + L) = \lambda(z_2 + L) = \lambda z_2 + L', \]

so $\lambda z_1$ and $\lambda z_2$ are in the same coset of $\mathbb{C}/L'$.

It is easily checked that this map also respects the Riemann surface structure of
the quotient groups $\mathbb{C}/L$ and $\mathbb{C}/L'$, that is, that it is holomorphic.
Applying the same reasoning with the roles of $L$ and $L'$ swapped, the map
$g : \mathbb{C}/L' \to \mathbb{C}/L$ defined by

\[ g(w + L') = \lambda^{-1} w + L \]

is a well-defined holomorphic map of $\mathbb{C}/L'$ into $\mathbb{C}/L$, and trivially $g$ and
$f$ are inverse to each other; thus the two surfaces are biholomorphic.

Now we prove the “only if” part, which is the less obvious part. Assume that
$\mathbb{C}/L \cong \mathbb{C}/L'$ (meaning that the two tori are biholomorphic), and let
$f : \mathbb{C}/L \to \mathbb{C}/L'$ be a biholomorphism. We can assume without loss of
generality that $f$ maps the zero coset $0 + L$ to the zero coset $0 + L'$ (otherwise, replace
$f$ with its composition with a translation map $z + L' \mapsto z + a + L'$ for a suitable $a$).
Motivated by the proof of the “if” part above, it seems natural to ask whether $f$ can be
represented as a map of cosets inherited from an “ordinary” complex-valued function of a
complex-valued parameter. In other words, we look for an entire function
$\tilde f : \mathbb{C} \to \mathbb{C}$ for which

\[ f(z + L) = \tilde f(z) + L' \tag{4.38} \]

for all $z \in \mathbb{C}$. Schematically, it is helpful to think of such $\tilde f$ as the
“solution” to the problem of completing the dashed line in the commutative diagram

\[
\begin{array}{ccc}
\mathbb{C} & \overset{\tilde f}{\dashrightarrow} & \mathbb{C} \\
\Big\downarrow{\scriptstyle \varphi_L} & & \Big\downarrow{\scriptstyle \varphi_{L'}} \\
\mathbb{C}/L & \xrightarrow{\;\;f\;\;} & \mathbb{C}/L'
\end{array}
\]

where $\varphi_L : \mathbb{C} \to \mathbb{C}/L$ and $\varphi_{L'} : \mathbb{C} \to \mathbb{C}/L'$
denote the quotient maps associated with the quotient groups $\mathbb{C}/L$ and
$\mathbb{C}/L'$, respectively. That is, $\varphi_L$ and $\varphi_{L'}$ are given by

\[ \varphi_L(z) = z + L, \qquad \varphi_{L'}(w) = w + L'. \]

If you have studied topology or other areas of mathematics where such diagrams appear,
you are probably aware that the question of when we can “solve” such an equation in
the unknown map is a rather subtle one in general; in our particular situation, it will
not be very hard, fortunately. If such $\tilde f$ exists, it is often referred to as a lifting
of $f$ (with respect to the quotienting maps $\varphi_L, \varphi_{L'}$ that “descend” from the
“upstairs” part of the diagram to the “downstairs” part).

Now assume that such $\tilde f$ can be shown to exist; we will prove this shortly. Since
$f(0 + L) = 0 + L'$, we must have $\tilde f(0) \in L'$, and again we may assume without loss of
generality that $\tilde f(0) = 0$ by replacing $\tilde f$ by its composition with the
translation $w \mapsto w - \tilde f(0)$ if necessary.

The function $\tilde f$ is entire by assumption. We claim that it is in fact a conformal
automorphism of $\mathbb{C}$. The reason is that if $g : \mathbb{C}/L' \to \mathbb{C}/L$ denotes
the inverse map to $f$, then the same assumption we made above about the existence of a
lifting for $f$ also implies that there exists a lifting for $g$, that is, an entire function
$\tilde g : \mathbb{C} \to \mathbb{C}$ such that $g(w + L') = \tilde g(w) + L$ for all
$w \in \mathbb{C}$. Then it is easy to see that the fact that $f$ and $g$ are inverse to each
other or, in other words, that $f \circ g$ is the identity function, together with the
normalization $\tilde f(0) = 0 = \tilde g(0)$, implies also that the composition
$\tilde f \circ \tilde g$ of the lifted maps coincides with the identity function at least
locally in a neighborhood of 0; and similarly for $\tilde g \circ \tilde f$. Therefore by
analytic continuation in fact $\tilde f \circ \tilde g$ and $\tilde g \circ \tilde f$ both
coincide with the identity function globally on all of the complex plane. Thus we see that
$\tilde f$ and $\tilde g$ are inverse maps, and thus $\tilde f$ is an automorphism, as claimed.

Now we can apply the classification theorem for automorphisms of the complex
plane (Theorem 3.3) and conclude that $\tilde f(z)$ is of the form $\tilde f(z) = \lambda z + b$
with $\lambda \ne 0$. In our case, $\tilde f(0) = 0$, so $b = 0$ and $\tilde f(z) = \lambda z$.
In that case, for any $\omega \in L$, we have

\[ L' = 0 + L' = f(0 + L) = f(\omega + L) = \tilde f(\omega) + L' = \lambda\omega + L', \]

so $\lambda\omega \in L'$. This proves that $\lambda L \subset L'$. Applying the same reasoning
to the inverse map $\tilde g(w) = \tilde f^{-1}(w) = \lambda^{-1} w$ gives the opposite
inclusion $\lambda L \supset L'$, so finally we get that $L' = \lambda L$, as claimed.

It remains to prove the existence of the lifting $\tilde f$ of $f$. The reason why it exists
is fundamentally a topological one and has to do with the notion of a covering map. I will
sketch the argument, which is somewhat abstract and uses some background from the
theory of Riemann surfaces, and then also provide a self-contained proof that manages
to avoid any Riemann surface machinery.

The abstract explanation is as follows. In a general version of this situation, visualized
by the diagram

\[
\begin{array}{ccc}
X & \overset{\tilde f}{\dashrightarrow} & Y \\
\Big\downarrow{\scriptstyle \varphi} & & \Big\downarrow{\scriptstyle \psi} \\
U & \xrightarrow{\;\;f\;\;} & V
\end{array}
\]

in which $X, Y, U, V$ are Riemann surfaces and $\varphi : X \to U$ and $\psi : Y \to V$ are
covering maps, a theorem from the theory of Riemann surfaces says that the lifting $\tilde f$
is guaranteed to exist if the Riemann surface $X$ at the top-left corner of the diagram is
simply connected. (In that case, $X$ is called the universal cover or universal covering space
of $U$.) Fortunately, we are in precisely that scenario. So if you are familiar with that
result, then the proof is complete, and no more effort is required.

Now for the self-contained argument: the function $f \circ \varphi_L$ is a holomorphic map
from $\mathbb{C}$ to $\mathbb{C}/L'$. Let $z_0 \in \mathbb{C}$. By the definition of the Riemann
surface structure on $\mathbb{C}/L'$, in some open disc $U_{z_0}$ centered at $z_0$, this map is
represented by an ordinary holomorphic map $g_{z_0} : U_{z_0} \to \mathbb{C}$ such that
$f \circ \varphi_L = \varphi_{L'} \circ g_{z_0}$, that is, $f(z + L) = g_{z_0}(z) + L'$ for all
$z \in U_{z_0}$.

It is also easy to see that any other holomorphic map $h : U_{z_0} \to \mathbb{C}$
representing $f$ in such a way will have the form

\[ h(z) = g_{z_0}(z) + \omega' \tag{4.39} \]

for some $\omega' \in L'$. This is because the assumption on $g_{z_0}$ and $h$ implies that
$h(z) - g_{z_0}(z) \in L'$ for any $z \in U_{z_0}$, so (4.39) has to hold for some
$\omega' \in L'$ that might depend on $z$; but $z \mapsto h(z) - g_{z_0}(z)$ is a continuous
function of $z$, $U_{z_0}$ is connected, and $L'$ is discrete, so in fact $\omega'$ has to be
the same for all $z \in U_{z_0}$.

Observe further that for any $h$ as above, again by (4.39) we have $h' = g_{z_0}'$, that is,
the derivative $g_{z_0}'(z)$ is actually independent of the choice of $g_{z_0}$ from the set of
possible choices. By similar reasoning it is also easy to check that if
$z_1, z_2 \in \mathbb{C}$ have the property that $U_{z_1} \cap U_{z_2} \ne \emptyset$, then
$g_{z_1}'$ and $g_{z_2}'$ agree on $U_{z_1} \cap U_{z_2}$. We can therefore define a global
(entire) function $H : \mathbb{C} \to \mathbb{C}$ such that $H|_{U_{z_0}} = g_{z_0}'$ for each of
the local representation functions $g_{z_0}$.

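Now let $\tilde f : \mathbb{C} \to \mathbb{C}$ be the primitive of $H$ satisfying $\tilde f(0) = 0$ (guaranteed to
exist by Corollary 1.25). We claim that $\tilde f$ satisfies the claimed property (4.38) of being a
lifting for $f$. This equation is true for $z = 0$ by definition. Moreover, assume that we already know
that $f(z_0 + L) = \tilde f(z_0) + L'$ for some $z_0 \in \mathbb{C}$. We claim that this implies the same property
$f(z + L) = \tilde f(z) + L'$ for all $z \in U_{z_0}$ (the open disc centered at $z_0$ as above). This is because
in that disc we can write

\begin{align*}
f(z + L) &= (f \circ \varphi_L)(z) = (\varphi_{L'} \circ g_{z_0})(z) = \varphi_{L'}(g_{z_0}(z)) \\
&= \varphi_{L'}\Big( g_{z_0}(z_0) + \int_{z_0}^{z} g_{z_0}'(w)\,dw \Big) \\
&= \varphi_{L'}\Big( g_{z_0}(z_0) + \int_{z_0}^{z} H(w)\,dw \Big) \\
&= \varphi_{L'}\big( g_{z_0}(z_0) + \tilde f(z) - \tilde f(z_0) \big) \\
&= \varphi_{L'}(g_{z_0}(z_0)) + \varphi_{L'}(\tilde f(z)) - \varphi_{L'}(\tilde f(z_0)) \\
&= f(\varphi_L(z_0)) + \varphi_{L'}(\tilde f(z)) - \varphi_{L'}(\tilde f(z_0)) = \varphi_{L'}(\tilde f(z)) = \tilde f(z) + L',
\end{align*}

where we use the fact that $\varphi_{L'}$ is a group homomorphism (and use "+" to denote addition
both in $\mathbb{C}$ and in the quotient group $\mathbb{C}/L'$).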
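The conclusion from the above discussion is that if we define the set

\[ E = \{ z \in \mathbb{C} : (f \circ \varphi_L)(z) = (\varphi_{L'} \circ \tilde f)(z) \} \]

(the set of points for which (4.38) holds), then $E$ is nonempty (it contains $z = 0$) and open.
Moreover, $E$ is a closed set: if $(z_n)_{n=1}^{\infty}$ is a sequence of points in $E$ and $z_n \to \xi$ as
$n \to \infty$, then

\begin{align*}
(f \circ \varphi_L)(\xi) &= (f \circ \varphi_L)\big(\lim_{n\to\infty} z_n\big) = \lim_{n\to\infty} (f \circ \varphi_L)(z_n) \\
&= \lim_{n\to\infty} (\varphi_{L'} \circ \tilde f)(z_n) = (\varphi_{L'} \circ \tilde f)\big(\lim_{n\to\infty} z_n\big) = (\varphi_{L'} \circ \tilde f)(\xi),
\end{align*}

so $\xi$ is in $E$ as well.

We showed that $E \subset \mathbb{C}$ is closed and open (that is, it is a "clopen" set in topology
jargon) and is nonempty. The complex plane $\mathbb{C}$ is connected, which means that its only
clopen subsets are itself and the empty set. Thus $E = \mathbb{C}$. This establishes the lifting
property of $\tilde f$ and finishes the proof.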


Suggested exercises for Section 4.15. 4.10. 


4.16 Equivalence between complex tori and elliptic curves 


At the beginning of this chapter, we presented the topic of elliptic curves as motivation 
for the study of doubly periodic functions, but until now, we have not explained the 
precise way in which the study of doubly periodic functions is helpful for understanding 
the structure of elliptic curves. In fact, the connection between the two subjects is very
close and can be summarized by the slogan "elliptic curves are equivalent to complex
tori." The key lies in the differential equation (4.16) satisfied by $\wp(z)$, which implies that
for a given lattice $L \subset \mathbb{C}$ with invariants $g_2, g_3$, the point $(x, y) = (\wp(z), \wp'(z))$ lies on the
elliptic curve $\mathcal{E}$ described in (4.2). Moreover, the map $z \mapsto (\wp(z), \wp'(z))$ is, when properly
interpreted, a biholomorphism and an isomorphism of groups between the complex
torus $\mathbb{C}/L$ and the elliptic curve $\mathcal{E}$.

The following result gives a fuller description of this intriguing and highly nonob- 
vious correspondence between two classes of objects. 


Theorem 4.25 (Equivalence between complex tori and elliptic curves). Let $L \subset \mathbb{C}$ be a
lattice with associated invariants $g_2$ and $g_3$. Let $\mathcal{E} = \mathcal{E}(g_2, g_3)$ denote the elliptic curve

\[ \mathcal{E} : \quad y^2 = 4x^3 - g_2 x - g_3 \]

over the complex numbers, including the point at $\infty$. Then:

1. The elliptic curve $\mathcal{E}$ is nondegenerate and is equipped in a natural way with the
structure of a compact Riemann surface.

2. The map $\varphi : \mathbb{C}/L \to \mathcal{E}$ defined by

\[ \varphi(z + L) = \begin{cases} (\wp(z), \wp'(z)) & \text{if } z \notin L, \\ \infty & \text{if } z \in L, \end{cases} \]

is a biholomorphism of Riemann surfaces.

3. If $\mathcal{E}$ is also regarded as an abelian group with the group law defined as in Section 4.1,
and $\mathbb{C}/L$ is viewed as a quotient group of $\mathbb{C}$, then $\varphi$ is a group isomorphism in addition
to being a biholomorphism.

4. The association $L \mapsto \mathcal{E}(g_2, g_3)$ is a bijection from the set of lattices onto the set of
nondegenerate elliptic curves over $\mathbb{C}$.


The upshot of this result is the remarkable fact that the study of elliptic curves over 
C coincides (albeit in a rather nontrivial way) with the study of complex tori C/L. In
particular, we get that any elliptic curve is topologically a torus, which does not seem 
obvious from the definition. Moreover, the problem of classifying elliptic curves up to bi- 
holomorphism reduces to the already-discussed classification problem of complex tori. 

The proof of Theorem 4.25 is beyond the scope of this book and requires a more 
involved discussion of the group structure and Riemann surface structure on elliptic 
curves. For the details, see [61, Ch. 6]. 




Exercises for Chapter 4 


4.1 Prove that a topologically discrete additive subgroup of $\mathbb{C}$ must be the zero subgroup,
of the form $\omega\mathbb{Z}$ for some nonzero $\omega \in \mathbb{C}$, or of the form $\omega_1\mathbb{Z} + \omega_2\mathbb{Z}$ with
$\omega_1, \omega_2$ linearly independent over the real numbers.

4.2 Prove Lemma 4.9.

4.3 Identify the precise region of convergence of the Laurent expansion (4.14) and
prove the necessary bounds that justify that in that region the rearrangement in
the proof of Theorem 4.11 is valid.

4.4 To practice the technique demonstrated at the beginning of Section 4.7 that led to
the Eisenstein series identities (4.19)-(4.21), use your favorite computer algebra
system to extract additional Laurent expansion coefficients from the differential
equations (4.16) and (4.18) and see what kinds of explicit identities you get.

4.5 Try to apply the method of proof of Proposition 4.14 by equating the coefficients
of $z^{2n}$ in the Laurent expansions for both sides of (4.16) instead of (4.18). Do you get
any new identities involving the Eisenstein series?

4.6 This exercise explores an alternative and more direct method for proving the
recurrence (4.22), which was found by Zagier [74].

a) To illustrate the idea behind the method in a simple example, consider the
bivariate rational function

\[ R(s, t) = \frac{1}{s t^3} + \frac{1}{2 s^2 t^2} + \frac{1}{s^3 t}. \]

Check that $R(s, t)$ satisfies

\[ R(s, t) - R(s + t, t) - R(s, s + t) = \frac{1}{s^2 t^2}. \tag{4.40} \]

b) Sum both sides of (4.40) over all integer pairs $s, t \ge 1$ and perform a bit of
creative rearrangement of terms to conclude that

\[ \zeta(4) = \frac{2}{5}\,\zeta(2)^2 \]

(where $\zeta(s)$ is the Riemann zeta function). This is a nice identity in that for
example it makes it possible to deduce Euler's identity $\zeta(4) = \frac{\pi^4}{90}$ from its easier
cousin $\zeta(2) = \frac{\pi^2}{6}$.

c) Show that if we sum the sides of (4.40) instead over all pairs of complex numbers
$s, t$ in the "half-lattice"

\[ L_+ = \{ p\omega_1 + q\omega_2 : p, q \in \mathbb{Z} \text{ with } p \ge 1 \text{ or } [p = 0 \text{ and } q \ge 1] \}, \]


then by an analogous calculation we in fact obtain identity (4.19) relating the
Eisenstein series $G_4$ and $G_8$, which is the case $k = 4$ of (4.22).

d) Generalizing the idea above, let $k \ge 2$ and define

\begin{align*}
R_k(s, t) &= \frac{1}{s t^{2k-1}} + \frac{1}{2} \sum_{r=2}^{2k-2} \frac{1}{s^r t^{2k-r}} + \frac{1}{s^{2k-1} t} \\
&= \frac{1}{s t^{2k-1}} + \frac{1}{s^{2k-1} t} + \frac{1}{2} \cdot \frac{s^{2k-3} - t^{2k-3}}{s^{2k-2} t^{2k-2} (s - t)}.
\end{align*}

Show that $R_k(s, t)$ satisfies the identity

\[ R_k(s, t) - R_k(s + t, t) - R_k(s, s + t) = \sum_{j=1}^{k-1} \frac{1}{s^{2j} t^{2k-2j}}. \tag{4.41} \]

(To practice your computer algebra skills and save yourself a tedious calculation,
see if you can get the computer to prove this for you!)

e) Show that summing both sides of (4.41) over all integer pairs $s, t \ge 1$ yields the
recurrence relation

\[ \zeta(2k) = \frac{2}{2k+1} \sum_{j=1}^{k-1} \zeta(2j)\,\zeta(2k-2j) \tag{4.42} \]

satisfied by the values of the Riemann zeta function at positive even integers.

f) Show that if we assume that $\zeta(2) = \frac{\pi^2}{6}$, then (4.42), together with standard
properties of the Bernoulli numbers discussed in Exercise 1.15, can be used to give
a new proof by induction of formula (2.10) from Chapter 2.

g) Finally, show that summing both sides of (4.41) over all complex numbers $s, t$
in the half-lattice $L_+$ as in part c) above gives exactly (4.22).

h) The above calculations highlight an interesting connection between the values
$\zeta(2n)$ and the Eisenstein series $G_{2n}$, wherein the former can be viewed as a
certain limiting case of the latter. Can you make this notion more precise? See
Section 5.7 for additional clues.
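As a complement to part d), the following minimal sketch spot-checks identity (4.41) in exact rational arithmetic. It is only a sanity check at finitely many points, not the symbolic proof the exercise asks for, and the helper names `R`, `lhs`, and `rhs` are illustrative choices, not notation from the text.

```python
from fractions import Fraction


def R(k, s, t):
    """Zagier's rational function R_k(s, t) from part d), evaluated exactly."""
    s, t = Fraction(s), Fraction(t)
    # Middle sum over r = 2, ..., 2k-2 (weighted by 1/2 below).
    middle = sum(1 / (s**r * t**(2 * k - r)) for r in range(2, 2 * k - 1))
    return 1 / (s * t**(2 * k - 1)) + middle / 2 + 1 / (s**(2 * k - 1) * t)


def lhs(k, s, t):
    """Left-hand side of (4.41)."""
    s, t = Fraction(s), Fraction(t)
    return R(k, s, t) - R(k, s + t, t) - R(k, s, s + t)


def rhs(k, s, t):
    """Right-hand side of (4.41): sum over j = 1, ..., k-1."""
    s, t = Fraction(s), Fraction(t)
    return sum(1 / (s**(2 * j) * t**(2 * k - 2 * j)) for j in range(1, k))


if __name__ == "__main__":
    points = [(1, 1), (1, 2), (3, 2), (Fraction(3, 7), 5), (4, Fraction(2, 9))]
    for k in range(2, 6):
        for s, t in points:
            assert lhs(k, s, t) == rhs(k, s, t)
    print("identity (4.41) holds at all sampled points for k = 2, ..., 5")
```

The case $k = 2$ reduces to (4.40), so the same check covers part a).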

4.7 The Eisenstein series are known to satisfy other summation identities. As an exam- 
ple (taken from [57]), by extending Zagier’s method described in Exercise 4.6, or in 
any other way, prove the identity 


1 enya a N) 


Gont2 = Gad (any? é j Cena Gan-2he2- 


Gaa 1 


4.8 (a) Prove the following addition theorems for the Weierstrass $\wp$-function and its
derivative:

\[ \wp(z + w) = \frac{1}{4} \left( \frac{\wp'(z) - \wp'(w)}{\wp(z) - \wp(w)} \right)^2 - \wp(z) - \wp(w), \tag{4.43} \]

\begin{align*}
\wp'(z + w) = {} & -\frac{1}{4} \left( \frac{\wp'(z) - \wp'(w)}{\wp(z) - \wp(w)} \right)^3 + \frac{1}{(\wp(z) - \wp(w))^3} \\
& \times \big[ \wp(z)^3 \wp'(z) - \wp(w)^3 \wp'(w) \\
& \quad - 2\big( \wp(z)^3 \wp'(w) - \wp(w)^3 \wp'(z) \big) \\
& \quad + 3 \wp(z) \wp(w) \big( \wp(z) \wp'(w) - \wp(w) \wp'(z) \big) \big]. \tag{4.44}
\end{align*}

Guidance. It may be useful to note that assuming (4.43), the second identity
(4.44) is equivalent to the determinantal identity

\[ \begin{vmatrix} 1 & 1 & 1 \\ \wp(z) & \wp(w) & \wp(z + w) \\ \wp'(z) & \wp'(w) & -\wp'(z + w) \end{vmatrix} = 0. \]

(b) Use (4.43)-(4.44) together with the fact that the Weierstrass $\wp$-function and its
derivative parameterize the elliptic curve (4.2) to prove that formulas (4.3)-
(4.4) define a valid group addition law on the elliptic curve $\mathcal{E}$ in (4.2).
4.9 Prove the duplication formula

\[ \wp(2z) = \frac{1}{4} \left( \frac{12\,\wp(z)^2 - g_2}{2\,\wp'(z)} \right)^2 - 2\,\wp(z). \]

4.10 (a) Given a lattice $L \subset \mathbb{C}$, identify the complex numbers $\lambda$ for which $\lambda L = L$.
(b) Given a lattice $L \subset \mathbb{C}$, find all the conformal automorphisms of the complex
torus $\mathbb{C}/L$.


5 Modular forms 


There are five elementary arithmetical operations: addition, subtraction, multiplication, division, 
and modular forms. 


Martin Eichler¹


5.1 Motivation: functions of lattices 


Our investigations of elliptic functions in the previous chapter gave rise to a host of
interesting quantities associated with a lattice $L \subset \mathbb{C}$; among them, the Eisenstein series
$G_{2k}$, modular discriminant $\Delta$, and Klein's $J$-invariant. As we discussed in Section 4.14,
these quantities can be viewed as functions of the modular variable $\tau$ that we use to
parameterize (up to a trivial scaling operation) the space of lattices, associating it
canonically with the lattice $L_\tau = \mathbb{Z} + \tau\mathbb{Z}$. Moreover, we saw that these functions satisfy
interesting identities, such as the relations $G_8 = \frac{3}{7} G_4^2$, $G_{10} = \frac{5}{11} G_4 G_6$ and the more
general recurrence relation (4.22). As we will see a bit later (Section 5.7), these types of
complex-analytic identities encode identities of a purely number-theoretic nature; for example,
the relation just mentioned between $G_8$ and $G_4$ is equivalent to the curious identity

\[ \sigma_7(n) = \sigma_3(n) + 120 \sum_{k=1}^{n-1} \sigma_3(k)\,\sigma_3(n-k) \qquad (n \ge 1), \tag{5.1} \]

where $\sigma_\alpha(m)$ denotes the generalized sum-of-divisors function defined as

\[ \sigma_\alpha(m) = \sum_{d \mid m} d^{\alpha} \tag{5.2} \]

(the sum of the $\alpha$-powers of the divisors of $m$). And this is just beginning to scratch the
surface of the wealth of remarkable phenomena these functions are involved in.
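Identity (5.1) is easy to test by direct computation. The sketch below is a brute-force check for small $n$; the function names `sigma` and `identity_5_1_holds` are illustrative and not part of the text.

```python
def sigma(alpha, m):
    """Generalized sum-of-divisors function: sum of d**alpha over divisors d of m."""
    return sum(d**alpha for d in range(1, m + 1) if m % d == 0)


def identity_5_1_holds(n):
    """Check sigma_7(n) == sigma_3(n) + 120 * sum_{k=1}^{n-1} sigma_3(k) sigma_3(n-k)."""
    convolution = sum(sigma(3, k) * sigma(3, n - k) for k in range(1, n))
    return sigma(7, n) == sigma(3, n) + 120 * convolution


if __name__ == "__main__":
    assert all(identity_5_1_holds(n) for n in range(1, 101))
    print("identity (5.1) verified for n = 1, ..., 100")
```

For instance, for $n = 2$ both sides equal $129 = 1 + 2^7 = 9 + 120 \cdot 1 \cdot 1$.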

From now on we will make the dependence on the modular variable $\tau$ more explicit
by writing $G_{2k}(\tau)$, $\Delta(\tau)$, and $J(\tau)$ instead of $G_{2k}$, $\Delta$, and $J$. At the heart of the phenomena
mentioned above is the fact that the functions $G_{2k}(\tau)$, $J(\tau)$, $\Delta(\tau)$ all satisfy interesting
"transformation properties," that is, functional equations that relate their value at $\tau$ to
their value at $\frac{a\tau+b}{c\tau+d}$ for a certain class of Möbius transformations $\tau \mapsto \frac{a\tau+b}{c\tau+d}$. This fact is
essentially immediate from the definitions; we record it as a lemma.


Lemma 5.1. The functions $G_{2k}(\tau)$ ($k \ge 2$), $J(\tau)$, and $\Delta(\tau)$ satisfy the functional equations

\[ G_{2k}\left( \frac{a\tau+b}{c\tau+d} \right) = (c\tau + d)^{2k}\, G_{2k}(\tau), \tag{5.3} \]

\[ J\left( \frac{a\tau+b}{c\tau+d} \right) = J(\tau), \tag{5.4} \]

\[ \Delta\left( \frac{a\tau+b}{c\tau+d} \right) = (c\tau + d)^{12}\, \Delta(\tau) \tag{5.5} \]

for all $\tau \in \mathbb{H}$ and $a, b, c, d \in \mathbb{Z}$ satisfying $ad - bc = 1$.


1 This quote may be apocryphal; see the discussion in [W19].


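Proof. Relations (5.4)-(5.5) follow immediately from (5.3) and the definitions of $J(\tau)$ and
$\Delta(\tau)$. To prove (5.3), apply definition (4.13) of $G_{2k}$ to write

\begin{align*}
G_{2k}\left( \frac{a\tau+b}{c\tau+d} \right) &= \sum_{(m,n) \in \mathbb{Z}^2 \setminus \{(0,0)\}} \left( m + n\,\frac{a\tau+b}{c\tau+d} \right)^{-2k} \\
&= (c\tau + d)^{2k} \sum_{(m,n) \in \mathbb{Z}^2 \setminus \{(0,0)\}} \big( m(c\tau + d) + n(a\tau + b) \big)^{-2k} \\
&= (c\tau + d)^{2k} \sum_{(m,n) \in \mathbb{Z}^2 \setminus \{(0,0)\}} \big( (dm + bn) + (cm + an)\tau \big)^{-2k}. \tag{5.6}
\end{align*}

Denoting new summation indices $p = dm + bn$ and $q = cm + an$ or, in matrix notation,

\[ \begin{pmatrix} q \\ p \end{pmatrix} = \begin{pmatrix} a & c \\ b & d \end{pmatrix} \begin{pmatrix} n \\ m \end{pmatrix}, \]

we can rewrite the last expression in (5.6) as

\[ (c\tau + d)^{2k} \sum_{p,q} \frac{1}{(p + q\tau)^{2k}}, \tag{5.7} \]

where the summation ranges over the possible pairs $(p, q)$ associated with $(m, n) \in \mathbb{Z}^2 \setminus
\{(0, 0)\}$ through the above linear transformation. However, the assumptions on $a, b, c, d$
imply that the matrix $\begin{pmatrix} a & c \\ b & d \end{pmatrix}$ maps $\mathbb{Z}^2 \setminus \{(0,0)\}$ bijectively onto itself, so the summation
range is exactly $\mathbb{Z}^2 \setminus \{(0,0)\}$, and we see that (5.7) is precisely $(c\tau + d)^{2k} G_{2k}(\tau)$.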
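As a worked illustration (not part of the original argument), it may help to specialize (5.3)-(5.5) to the two simplest admissible choices of coefficients. Taking $(a, b, c, d) = (1, 1, 0, 1)$ gives $c\tau + d = 1$, and taking $(a, b, c, d) = (0, -1, 1, 0)$ gives $\frac{a\tau+b}{c\tau+d} = -\frac{1}{\tau}$ and $c\tau + d = \tau$; both choices satisfy $ad - bc = 1$. The functional equations then read:

```latex
\begin{align*}
G_{2k}(\tau + 1) &= G_{2k}(\tau), & G_{2k}(-1/\tau) &= \tau^{2k}\, G_{2k}(\tau), \\
J(\tau + 1) &= J(\tau), & J(-1/\tau) &= J(\tau), \\
\Delta(\tau + 1) &= \Delta(\tau), & \Delta(-1/\tau) &= \tau^{12}\, \Delta(\tau).
\end{align*}
```

The first column is the statement that all three functions are periodic with period 1; the second column is the less obvious inversion symmetry.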


Conceptually, the transformation properties (5.3)-(5.5) can be regarded as a kind of
family of internal symmetries of the functions $G_{2k}(\tau)$, $J(\tau)$, and $\Delta(\tau)$. As the easy
calculation above shows, these symmetries are simply a manifestation of the fact that the
functions were originally defined in terms of infinite summations over a lattice, and so
they must transform in a specific way when we switch from one fundamental period
pair $\omega_1, \omega_2$ generating the lattice to another. However, it turns out that functions with
similar internal symmetries arise in many other places where the reason for the
symmetry holding is not nearly as self-evident (we will see examples of this later; see
Section 5.13). The systematic study of functions with these types of symmetries, which we
now undertake, is the beginning of the theory of modular forms, a rich subbranch of
complex analysis that has strong connections to elliptic functions, number theory, and
numerous other topics in mathematics.


Suggested exercises for Section 5.1. 5.1.


5.2 The modular group Γ = PSL(2, Z)


Lemma 4.1 and Theorem 4.24(b) give conditions for two lattices to be equal and
homothetic, respectively. It is convenient to think about these types of equivalences in terms of
group actions. The condition for the equivalence of two lattices $\omega_1\mathbb{Z} + \omega_2\mathbb{Z}$ and
$\omega_1'\mathbb{Z} + \omega_2'\mathbb{Z}$ given in Lemma 4.1 can be interpreted as the statement that $(\omega_1, \omega_2)$ and
$(\omega_1', \omega_2')$ are in the same orbit under the action of the general linear group of order 2
over $\mathbb{Z}$ defined by

\[ GL(2, \mathbb{Z}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{Z},\ ad - bc = \pm 1 \right\}. \]

Our interest is mainly in describing lattices up to homothety, which means that we can
consider the action of a smaller group. Let

\[ SL(2, \mathbb{Z}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} : a, b, c, d \in \mathbb{Z},\ ad - bc = 1 \right\} \]

be the special linear group of order 2 over $\mathbb{Z}$. Note that $SL(2, \mathbb{Z})$ has a normal subgroup
$\{\pm I\}$ of order 2 comprising the identity matrix $I$ and its negative. We define the group $\Gamma$
as the quotient group

\[ \Gamma = SL(2, \mathbb{Z}) / \{\pm I\}. \]

This group is known as the modular group (or in certain contexts as the projective
special linear group of order 2 over $\mathbb{Z}$). The notation $\Gamma$ is in common use in the theory
of modular forms. The alternative notation $PSL(2, \mathbb{Z})$ is also sometimes used to denote
the same group.

It turns out that $\Gamma$ is the "correct" group to work with for our complex-analytic
purposes, since it measures the precise extent of nonuniqueness when studying lattices up
to homothety and parameterizing them using the modular variable $\tau$ as discussed in
Section 4.14. This will be explained in Sections 5.3-5.4. We start however by thinking
about $\Gamma$ from a more abstract group-theoretic point of view.

Working with quotient groups is a bit cumbersome, and in the case of $\Gamma$ the
quotienting is quite minimal, involving the identification of pairs $\pm A$ of matrices. It is therefore
common to abuse notation slightly and still denote elements of $\Gamma$ as $2 \times 2$ matrices with
the understanding that both such a matrix $A$ and its negation $-A$ represent the same
element of $\Gamma$ and that all matrix equations written in this context are only assumed to
hold modulo the subgroup $\{\pm I\}$.²


2 Note that we can get away with this without running into trouble as long as we only multiply matrices,
as opposed to adding them or performing other operations that do not behave well under the quotienting
homomorphism.


Three elements of $\Gamma$ that play a special role in its analysis are the matrices

\[ S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad T = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad U = ST = \begin{pmatrix} 0 & 1 \\ -1 & -1 \end{pmatrix}. \tag{5.8} \]

Note that $S^2 = I$ (in the sense of the abuse of notation mentioned above), $U^3 = I$, and
$T^k = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix}$, so that $S$, $U$, and $T$ generate cyclic subgroups of $\Gamma$ of orders 2, 3, and $\infty$,
respectively.


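Theorem 5.2. The group $\Gamma$ is generated by the elements $T$, $S$.


Proof. Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$. We may assume that $c \ge 0$; otherwise, replace $A$ by $-A$ (recall
that the two are equal as elements of $\Gamma$). We prove by induction on $c$ that $A$ can be
represented as a product of elements of the form $S$ and $T^k$, $k \in \mathbb{Z}$. In the case $c = 0$, $A$ is of
the form $\begin{pmatrix} a & b \\ 0 & d \end{pmatrix}$, and since $\det A = ad = 1$ and the entries are integers, actually

\[ A = \begin{pmatrix} 1 & b \\ 0 & 1 \end{pmatrix} = T^b \qquad \text{or} \qquad A = \begin{pmatrix} -1 & b \\ 0 & -1 \end{pmatrix} = \begin{pmatrix} 1 & -b \\ 0 & 1 \end{pmatrix} = T^{-b}, \]

both of which are of the required form.

For the inductive step, we assume that the claim has been proved in the case where
the entry in the south-west corner of the matrix is strictly less than $c$. Dividing $d$ by $c$
with remainder, we let $q, r$ denote the integers for which

\[ d = qc + r, \qquad 0 \le r < c. \]

Then

\[ A T^{-q} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 & -q \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a & -aq + b \\ c & r \end{pmatrix}, \]

and therefore

\[ A T^{-q} S = \begin{pmatrix} aq - b & a \\ -r & c \end{pmatrix} = \begin{pmatrix} -aq + b & -a \\ r & -c \end{pmatrix} =: M \]

(the last two matrices being equal as elements of $\Gamma$). The south-west entry $r$ of $M$ satisfies
$0 \le r < c$. Applying the inductive hypothesis to the matrix $M$ on the right-hand side, we see that it
can be expressed as a product of group elements involving appearances of $S$ and powers
(negative or positive) of $T$. Therefore $A = M S T^q$ can also be expressed in such a way, and
we are done.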
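The induction in the proof above is effectively an algorithm, a matrix version of the Euclidean algorithm. The following minimal sketch implements it under the conventions of (5.8); the function names (`decompose`, `evaluate`) and the representation of a word as a list of `("S", 1)` and `("T", k)` pairs are illustrative assumptions, not notation from the text.

```python
def mat_mul(X, Y):
    """Product of two 2x2 integer matrices given as nested tuples."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


S = ((0, 1), (-1, 0))  # the matrix S from (5.8)


def T_pow(k):
    """The matrix T^k = [[1, k], [0, 1]]."""
    return ((1, k), (0, 1))


def decompose(a, b, c, d):
    """Return a word in S and powers of T whose product is +-[[a, b], [c, d]]."""
    assert a * d - b * c == 1, "matrix must have determinant 1"
    if c < 0:  # A and -A are the same element of the modular group
        a, b, c, d = -a, -b, -c, -d
    if c == 0:  # base case: a = d = +-1, so A = +-T^(+-b)
        return [("T", b if a == 1 else -b)]
    q, r = divmod(d, c)  # d = q*c + r with 0 <= r < c
    # Inductive step: A = M S T^q with M = [[b - a*q, -a], [r, -c]].
    return decompose(b - a * q, -a, r, -c) + [("S", 1), ("T", q)]


def evaluate(word):
    """Multiply out a word produced by decompose()."""
    product = ((1, 0), (0, 1))
    for name, k in word:
        product = mat_mul(product, S if name == "S" else T_pow(k))
    return product


if __name__ == "__main__":
    A = ((2, 1), (5, 3))
    word = decompose(2, 1, 5, 3)
    P = evaluate(word)
    negA = tuple(tuple(-x for x in row) for row in A)
    assert P in (A, negA)
    print(word)
```

The word is only determined up to the sign ambiguity $\pm A$, which is why the check compares the product against both $A$ and $-A$.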


5.3 The modular group as a group of Möbius transformations


In Section 4.14, we introduced the point of view whereby the space of lattices up to
homothety is parameterized in terms of the modular variable $\tau$ taking values in the upper
half-plane. The following lemma adapts the statement of Theorem 4.24(b) to that new
point of view.


Lemma 5.3. Let $\tau, \tau' \in \mathbb{H}$. The lattices $L_\tau = \mathbb{Z} + \tau\mathbb{Z}$ and $L_{\tau'} = \mathbb{Z} + \tau'\mathbb{Z}$ are homothetic if
and only if $\tau'$ is related to $\tau$ via

\[ \tau' = \frac{a\tau + b}{c\tau + d} \qquad \text{for some } a, b, c, d \in \mathbb{Z},\ ad - bc = 1. \tag{5.9} \]


Proof. Exercise 5.2. 


From Lemma 5.3 we see that the study of lattices up to homothety can be regarded
as the study of the set of points in $\mathbb{H}$ quotiented out by the action of a group of Möbius
transformations of the form (5.9). In fact, this group is canonically isomorphic to the
modular group $\Gamma$ with the isomorphism sending the element $\pm\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ of $\Gamma$ to the Möbius
transformation $\tau \mapsto \frac{a\tau+b}{c\tau+d}$. In another small abuse of notation that is standard practice
in the field, we still use the same letter $\Gamma$ to denote this group and still refer to it as the
modular group. That is, we write

\[ \Gamma = SL(2, \mathbb{Z}) / \{\pm I\} = \left\{ \tau \mapsto \frac{a\tau + b}{c\tau + d} : a, b, c, d \in \mathbb{Z},\ ad - bc = 1 \right\} \]

with the convention that the map $\tau \mapsto \frac{a\tau+b}{c\tau+d}$ is simply another way to represent the group
element $\pm\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ of $\Gamma$. When referring to group elements, we will often use the same letter
to denote an element of $\Gamma$ thought of either as a matrix (with a $\pm$ sign ambiguity) or as
a Möbius transformation. In particular, the group elements $S$, $T$, and $U$ defined in (5.8)
have the expressions

\[ S(\tau) = \frac{-1}{\tau}, \qquad T(\tau) = \tau + 1, \qquad U(\tau) = \frac{-1}{\tau + 1} \]

in their interpretation as Möbius transformations.

Being able to switch at will between the two alternative points of view of working
with matrices on the one hand and Möbius transformations on the other is convenient,
since some arguments become simpler when considered from one of the points of view,
and others are easier to understand from the alternative one.
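The dictionary between matrices and Möbius transformations can be checked numerically. The sketch below, with illustrative helper names `mat_mul` and `mobius` and an arbitrarily chosen test point, confirms that the matrices in (5.8) act as stated, that matrix multiplication corresponds to composition of maps, and that $A$ and $-A$ define the same map.

```python
def mat_mul(X, Y):
    """Product of two 2x2 matrices given as nested tuples."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def mobius(M, tau):
    """Apply the Moebius transformation tau -> (a*tau + b)/(c*tau + d)."""
    (a, b), (c, d) = M
    return (a * tau + b) / (c * tau + d)


S = ((0, 1), (-1, 0))
T = ((1, 1), (0, 1))
U = mat_mul(S, T)  # U = ST as in (5.8)

if __name__ == "__main__":
    tau = complex(0.3, 1.7)  # an arbitrary point of the upper half-plane
    assert abs(mobius(S, tau) - (-1 / tau)) < 1e-12
    assert abs(mobius(T, tau) - (tau + 1)) < 1e-12
    assert abs(mobius(U, tau) - (-1 / (tau + 1))) < 1e-12
    # Matrix product corresponds to composition: (ST)(tau) = S(T(tau)).
    assert abs(mobius(U, tau) - mobius(S, mobius(T, tau))) < 1e-12
    # A and -A give the same transformation.
    minus_S = ((0, -1), (1, 0))
    assert abs(mobius(minus_S, tau) - mobius(S, tau)) < 1e-12
    print("matrix and Moebius descriptions agree at tau =", tau)
```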


Suggested exercises for Section 5.3. 5.2, 5.3. 


5.4 The fundamental domain and the modular surface H/Γ


Having identified the modular group as capturing the notion of the equivalence of two
modular parameters $\tau, \tau'$ that represent the same lattice, it is natural to ask for a
complete set of equivalence class representatives, that is, a set of values of $\tau$ such that each
point in the upper half-plane is equivalent to precisely one. (This question is precisely
analogous to the idea that led us to the notion of a fundamental parallelogram in the
study of elliptic functions.) The identification of such a set is one of the famous results
of the field. It is given by

\[ D = \left\{ \tau \in \mathbb{H} : -\tfrac{1}{2} \le \operatorname{Re}(\tau) < \tfrac{1}{2} \ \text{ and } \ \Big[ |\tau| > 1 \ \text{ or } \ |\tau| = 1,\ \tfrac{\pi}{2} \le \arg \tau \le \tfrac{2\pi}{3} \Big] \right\}. \]

We call $D$ the fundamental domain under the action of $\Gamma$; see Fig. 5.1.


Figure 5.1: The fundamental domain D.


Theorem 5.4. The translates $A(D)$, $A \in \Gamma$, of the fundamental domain $D$ under the
elements of $\Gamma$ tile the upper half-plane without overlap, except for specific exceptions given
below. More precisely, each $\tau \in \mathbb{H}$ has a representation of the form

\[ \tau = A(\tau_0) \tag{5.10} \]

for some $A \in \Gamma$ and $\tau_0 \in D$. The point $\tau_0$ is unique. The Möbius transformation $A$ is also
unique if $\tau_0 \neq i, e^{2\pi i/3}$. If $\tau_0 = i$, then there are precisely two distinct representations

\[ \tau = A_1(i) = A_2(i), \]

where $A_1, A_2 \in \Gamma$ are related by $A_2 = A_1 S$. If $\tau_0 = e^{2\pi i/3}$, then there are precisely three
distinct representations

\[ \tau = A_1(e^{2\pi i/3}) = A_2(e^{2\pi i/3}) = A_3(e^{2\pi i/3}), \]

where $A_1, A_2, A_3 \in \Gamma$ are related by $A_2 = A_1 U$ and $A_3 = A_1 U^2$.


Proof. Let $\tau \in \mathbb{H}$. We prove the existence of $A$ and $\tau_0$ satisfying (5.10). Recall from (3.11)
that for $a, b, c, d \in \mathbb{R}$, we have the formula

\[ \operatorname{Im}\left( \frac{a\tau + b}{c\tau + d} \right) = \frac{ad - bc}{|c\tau + d|^2}\, \operatorname{Im}(\tau). \tag{5.11} \]

In particular, $\operatorname{Im}(A(\tau)) = \frac{\operatorname{Im}(\tau)}{|c\tau + d|^2}$ for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma$. Now the set of points

\[ \{ c\tau + d : (c, d) \in \mathbb{Z}^2 \setminus \{(0, 0)\} \} \]

is discrete and in particular disjoint from a neighborhood of 0; hence there exists some
point of the form $c_0\tau + d_0$ in this set for which $|c\tau + d|$ is minimal. It is clear that for this
$c_0, d_0$ we must have $\gcd(c_0, d_0) = 1$ (otherwise, divide each of $c_0$ and $d_0$ by their g.c.d. to
get a pair with a smaller value of $|c\tau + d|$). This in turn implies that there exist integers
$a_0$ and $b_0$ for which $a_0 d_0 - b_0 c_0 = 1$, in other words, such that the matrix
$A_0 = \begin{pmatrix} a_0 & b_0 \\ c_0 & d_0 \end{pmatrix}$ is an element of $\Gamma$. By the construction this $A_0$ has the property that
$\operatorname{Im}(A_0(\tau))$ is maximal over all $A \in \Gamma$. By replacing $A_0$ by $T^k A_0$ for a suitable $k \in \mathbb{Z}$ (thus
replacing $A_0(\tau)$ with $A_0(\tau) + k$, which does not affect the imaginary value) we can also assume
without loss of generality that $-\frac{1}{2} \le \operatorname{Re}(A_0\tau) < \frac{1}{2}$, still retaining the maximality property.

Having chosen $A_0$, denote $\tau' = A_0(\tau)$. We claim that $|\tau'| \ge 1$. To see this, assume by
contradiction that $|\tau'| < 1$. Then letting $B = SA_0$, we have

\[ |\operatorname{Im}(B(\tau))| = |\operatorname{Im}(S\tau')| = |\operatorname{Im}(-1/\tau')| > |\operatorname{Im}(\tau')| = |\operatorname{Im}(A_0\tau)|, \]

contradicting the maximality property of $A_0$.

Now if $\tau' \in D$, then we can denote $A = A_0^{-1}$, $\tau_0 = \tau'$, and get that (5.10) holds, so we
are done with the proof of the existence claim. Otherwise, we must have $|\tau'| = 1$ and
$\frac{\pi}{3} < \arg(\tau') < \frac{\pi}{2}$. In that case, let $\tau_0 = S\tau' = -1/\tau'$ and note that $\tau_0 \in D$, so that if we
define $A = (SA_0)^{-1}$, then (5.10) again holds. Thus the existence of the representation has
been proved.

Now assume that $\tau$ has two distinct representations $\tau = A\tau_0 = A'\tau_0'$ with $\tau_0, \tau_0' \in D$
and $A, A' \in \Gamma$. Our goal is to show that this can only happen in the specific situations
listed in the theorem.

Assume without loss of generality that $\operatorname{Im}(\tau_0') \ge \operatorname{Im}(\tau_0)$ (otherwise, switch their
labels). Denote $B = (A')^{-1}A$. Then $\tau_0' = B\tau_0 = \frac{a\tau_0 + b}{c\tau_0 + d}$, where $a, b, c, d$ denote the entries of $B$.
Then by (5.11) we get that

\[ |c\tau_0 + d| \le 1. \tag{5.12} \]

Since $\tau_0 \in D$ and $c, d$ are integers, there are not too many ways this inequality
can hold. First, we could have $c = 0$ and $d = \pm 1$. In that case, we must have $a = d$, and
therefore $B$ is of the form

\[ B = \begin{pmatrix} \pm 1 & b \\ 0 & \pm 1 \end{pmatrix} = \begin{pmatrix} 1 & \pm b \\ 0 & 1 \end{pmatrix} \]

(as elements of $\Gamma$). Then $\tau_0' = B\tau_0 = \tau_0 \pm b$. The conditions
$-\frac{1}{2} \le \operatorname{Re}(\tau_0), \operatorname{Re}(\tau_0') < \frac{1}{2}$ then guarantee that $b = 0$, so $B$
is the identity map, $A' = A$, and $\tau_0' = \tau_0$, so that this is the case where the two
representations $\tau = A\tau_0 = A'\tau_0'$ are the same, which is not relevant to the current discussion.

A second possibility for (5.12) to hold is that $d = 0$, $c = \pm 1$, and $|\tau_0| = 1$. In that case,
$B$ is of the form

\[ B = \begin{pmatrix} a & \mp 1 \\ \pm 1 & 0 \end{pmatrix}. \]

Therefore $\tau_0' = -\frac{1}{\tau_0} \pm a$, or alternatively, if we write $\tau_0 = e^{i\theta}$ and $\alpha = \pm a$, then

\[ \tau_0' = e^{i(\pi - \theta)} + \alpha. \]

For this to hold with $\tau_0, \tau_0'$ elements of $D$ and $\alpha$ an integer, we must have that either

\[ \alpha = 0, \quad B = S, \quad \text{and} \quad \tau_0' = \tau_0 = i, \tag{5.13} \]

or

\[ \alpha = -1, \quad B = T^{-1}S, \quad \text{and} \quad \tau_0' = \tau_0 = e^{2\pi i/3}. \tag{5.14} \]

In the first subcase (5.13), the two representations for $\tau$ become

\[ \tau = A(i) = AS(i). \tag{5.15} \]

In the second subcase (5.14), we get that $B^{-1} = U$, so the two representations are

\[ \tau = A(e^{2\pi i/3}) = AU(e^{2\pi i/3}). \tag{5.16} \]

The third and final possibility for (5.12) to hold is that $c = d = \pm 1$ and $\tau_0 = e^{2\pi i/3}$.
Assume without loss of generality that $c = d = 1$ (in the other case, replace $a, b, c, d$ with
the numbers $-a, -b, -c, -d$, respectively, which represent the same element of $\Gamma$). In that
case the condition $ad - bc = 1$ forces $a = b + 1$, and we see that $B$ is of the form

\[ B = \begin{pmatrix} b + 1 & b \\ 1 & 1 \end{pmatrix} = T^{b+1} U. \]

Then

\begin{align*}
\tau_0' &= \frac{(b + 1)\tau_0 + b}{\tau_0 + 1} = b + \frac{\tau_0}{\tau_0 + 1} = b + 1 - \frac{1}{\tau_0 + 1} \\
&= b + 1 - \frac{1}{e^{\pi i/3}} = b + 1 + e^{2\pi i/3} = b + 1 + \tau_0.
\end{align*}

Thus we must have $b = -1$, and therefore $B = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix} = U$, $B^{-1} = U^{-1} = U^2$, and we get
that the two representations for $\tau$ are

\[ \tau = A(e^{2\pi i/3}) = AU^2(e^{2\pi i/3}). \tag{5.17} \]

Summarizing, we showed that representation (5.10) is unique except for the three
possible exceptions we identified, which are given by (5.15), (5.16), and (5.17). Those were
precisely the exceptions listed in the theorem. This finishes the proof.


The fundamental domain $D$ can be thought of as the "arena" where modular
functions and modular forms "live." We will do all our analysis in reference to this arena.
This is mostly straightforward, except for some technical subtleties that will arise when
functions have zeros or poles on the boundary of $D$. (This is analogous to the issue that
led us to consider fundamental parallelograms of the form $P_{z_0}(\omega_1, \omega_2)$ with an arbitrary
origin point $z_0$ in Chapter 4 as a way to avoid having to worry about doubly periodic
functions that have zeros or poles on the boundary of the parallelogram. In the case of
modular forms, this issue is harder to work around using a simple translation trick of
that type.)

We mention in passing that there is a more advanced, but conceptually clearer, point
of view, in which the correct object to regard as the arena on which modular forms and
functions are defined is the quotient space $\mathbb{H}/\Gamma$, that is, the space of orbits of $\mathbb{H}$ under
the action of $\Gamma$. This quotient space is equipped in a natural way with the structure of
a Riemann surface and is called the modular surface. The fundamental domain $D$ is
just one particular coordinate chart (in the sense of being an element of the atlas of
charts a Riemann surface and other manifold-like objects come equipped with) that is
used to perform calculations on it. Understanding this point of view will make various
arguments and calculations in some of the proofs in this chapter appear more intuitive
and motivated but is not strictly necessary from a formal point of view, so we will not
discuss the details of how such arguments can be presented from the point of view of
Riemann surfaces.


Suggested exercises for Section 5.4. 5.4. 


5.5 The classification problem for complex tori, part II 


We now return to the classification problem for complex tori discussed in Section 4.15. 
Previously we solved the first part of the problem when we gave a necessary and suf- 
ficient condition for two tori C/L and C/L’ to be biholomorphic. Now we can use the 
results of the previous section to give a solution to the second part, namely finding a 
canonical system of representatives under this equivalence relation on the family of 
lattices.


Theorem 5.5 (Classification of complex tori; second part). The family of complex tori

\[ \{ \mathbb{C}/L_\tau : \tau \in D \} \tag{5.18} \]

(where $L_\tau = \mathbb{Z} + \tau\mathbb{Z}$ as before) forms a complete set of biholomorphism representatives
of the complex tori $\mathbb{C}/L$, that is, each complex torus $\mathbb{C}/L$ is biholomorphic to $\mathbb{C}/L_{\tau_0}$ for
precisely one $\tau_0 \in D$. If $L$ is given explicitly as $L = \omega_1\mathbb{Z} + \omega_2\mathbb{Z}$ with $\omega_2/\omega_1 \in \mathbb{H}$, then $\tau_0$ is
the unique element of $D$ related to $\tau = \omega_2/\omega_1$ via (5.10) for some $A \in \Gamma$, with the
biholomorphism being the homothety $z \mapsto wz$ (more precisely: the map of Riemann surfaces
whose lifting is the homothety map, in the sense discussed in the proof of Theorem 4.24).


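Proof. First, we show that no two elements of the family (5.18) are biholomorphic.
Assume that $\tau_1, \tau_2 \in D$ where $\mathbb{C}/L_{\tau_1}$ and $\mathbb{C}/L_{\tau_2}$ are biholomorphic. Then by Theorem 4.24(a),
$L_{\tau_1}$ and $L_{\tau_2}$ are homothetic. By Lemma 5.3 we have

\[ \tau_2 = \frac{a\tau_1 + b}{c\tau_1 + d} = A(\tau_1) \qquad \text{for some } A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma. \]

Of course, $\tau_2$ can also be represented as $I(\tau_2)$, where $I$ is the identity element of $\Gamma$, so
since $\tau_1, \tau_2 \in D$, the uniqueness claim in Theorem 5.4 implies that $\tau_1 = \tau_2$.

For the remaining claim that the tori (5.18) include a representative of all
biholomorphism classes of complex tori, let $L = \omega_1\mathbb{Z} + \omega_2\mathbb{Z}$ be a lattice, where the ordering
of $\omega_1, \omega_2$ is chosen such that $\tau := \omega_2/\omega_1 \in \mathbb{H}$. Let $\tau_0 \in D$ be the unique point in the
fundamental domain, guaranteed to exist by Theorem 5.4, such that

\[ \tau = \frac{a\tau_0 + b}{c\tau_0 + d} \qquad \text{for some } A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma. \]

By Lemma 5.3 the lattices $\mathbb{Z} + \tau\mathbb{Z}$ and $\mathbb{Z} + \tau_0\mathbb{Z}$ are homothetic, that is, we have

\[ \mathbb{Z} + \tau\mathbb{Z} = \lambda(\mathbb{Z} + \tau_0\mathbb{Z}) \]

for some $\lambda \neq 0$. It then follows that

\[ L = \omega_1\mathbb{Z} + \omega_2\mathbb{Z} = \omega_1(\mathbb{Z} + \tau\mathbb{Z}) = \omega_1\lambda(\mathbb{Z} + \tau_0\mathbb{Z}) = \omega_1\lambda L_{\tau_0}. \]

Thus $L$ and $L_{\tau_0}$ are also homothetic, and by Theorem 4.24(a), $\mathbb{C}/L$ is biholomorphic to
$\mathbb{C}/L_{\tau_0}$, as claimed.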


5.6 The point at i∞, premodular forms, and their Fourier
expansions


In the sections below, we will start defining certain classes of functions that generalize
properties (5.3)-(5.5) of the explicit functions we constructed. All of them will share one
particular property that will be useful to name: we say that a function $f : \mathbb{H} \to \mathbb{C}$ is a
premodular form³ if it is

1. holomorphic;

2. periodic with period 1, that is, satisfies $f(\tau + 1) = f(\tau)$ for all $\tau \in \mathbb{H}$; and

3. for some constant $C \in \mathbb{R}$, $f(\tau)$ satisfies the asymptotic bound

\[ |f(\tau)| = O\big(e^{C \operatorname{Im}(\tau)}\big) \quad \text{as } \operatorname{Im}(\tau) \to \infty, \text{ uniformly in } \operatorname{Re}(\tau). \tag{5.19} \]

We say that a function $f : \mathbb{H} \to \mathbb{C}$ is a weak premodular form if it satisfies the same
conditions as for a premodular form, but with the first condition being relaxed to that
of $f$ being meromorphic.


Proposition 5.6. Let $f : \mathbb{H} \to \mathbb{C}$. Then $f(\tau)$ is a premodular form if and only if it has an
expansion of the form

\[ f(\tau) = \sum_{n=-m}^{\infty} a(n)\, e^{2\pi i n \tau} \qquad (\tau \in \mathbb{H}), \tag{5.20} \]

which converges absolutely, uniformly on compacts in $\mathbb{H}$, and where $m \ge 0$ is an integer.
We refer to expansion (5.20) as the Fourier expansion of $f$. The coefficients $a(n)$ are called
the Fourier coefficients of $f$ and can be recovered as

\[ a(n) = \int_{-1/2}^{1/2} f(x + iy)\, e^{-2\pi i n (x + iy)}\, dx \qquad (y > 0 \text{ arbitrary}). \tag{5.21} \]


Proof. The change of variables q = e^{2πiτ} defines the bijective correspondence f(τ) ↔ g(q), defined via the relation

\[ f(\tau) = g(e^{2\pi i \tau}), \tag{5.22} \]

between holomorphic functions f : H → C that are periodic with period 1 and holomorphic functions g : D \ {0} → C on the punctured unit disc. If we add the assumption that f(τ) satisfies a bound of the form (5.19), then that translates to the condition that g(q) must satisfy a bound of the form

\[ |g(q)| = O(|q|^{-M}), \qquad q \to 0, \]

for some constant M ∈ R. Of course, this asymptotic bound is nothing particularly exotic; it is easily seen to be equivalent to the statement that g(q) has either a removable singularity or a pole at q = 0. From Section 1.18 and Exercise 1.43 we know that any such function has a Laurent series of the form

\[ g(q) = \sum_{n=-m}^{\infty} a(n)\, q^n, \]

which converges (absolutely and uniformly on compacts) for q in the punctured disc. Translating this back to the language of f(τ), this shows exactly that the condition of f being a premodular form is equivalent to it having the Fourier expansion (5.20) with the appropriate convergence. Finally, the coefficients can be extracted in the usual way as an integral on the circular contour {|q| = r}, 0 < r < 1, using the residue theorem. Specifically, if we denote for convenience r = e^{−2πy}, y > 0, then we have

\[ a(n) = \frac{1}{2\pi i} \oint_{|q|=r} \frac{g(q)}{q^{n+1}}\, dq
 = \frac{1}{2\pi i} \int_{-1/2}^{1/2} \frac{g(r e^{2\pi i x})}{r^{n+1} e^{2\pi i (n+1) x}}\, (2\pi i)\, r e^{2\pi i x}\, dx \]
\[ \phantom{a(n)} = \int_{-1/2}^{1/2} g\big(e^{2\pi i (x+iy)}\big)\, e^{-2\pi i n (x+iy)}\, dx
 = \int_{-1/2}^{1/2} f(x+iy)\, e^{-2\pi i n (x+iy)}\, dx, \]

which is exactly (5.21).

3 Note that this term is not standard in the literature.
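The recovery formula (5.21) can be tried out numerically. The following is only a sketch under stated assumptions: the test function q^{-1} + 3 + 5q^2 is a hypothetical example chosen for illustration, and the helper name `fourier_coefficient` and the midpoint quadrature rule are choices made here, not part of the text.

```python
import cmath

def fourier_coefficient(f, n, y=0.5, samples=2048):
    """Approximate a(n) in (5.21) by the midpoint rule along Im(tau) = y."""
    total = 0j
    for j in range(samples):
        x = -0.5 + (j + 0.5) / samples          # midpoint of the j-th subinterval
        tau = complex(x, y)
        total += f(tau) * cmath.exp(-2j * cmath.pi * n * tau)
    return total / samples

def example(tau):
    # Hypothetical test function q^{-1} + 3 + 5 q^2, with q = e^{2 pi i tau}.
    q = cmath.exp(2j * cmath.pi * tau)
    return 1 / q + 3 + 5 * q ** 2

# Coefficients a(-2), ..., a(3) recovered on the line Im(tau) = 1/2.
recovered = {n: fourier_coefficient(example, n) for n in range(-2, 4)}
# The same coefficient computed on a different horizontal line.
same_on_other_line = fourier_coefficient(example, 2, y=1.0)
```

The value of y only changes the horizontal line of integration, which is why the recovered coefficient should not depend on it, in line with the remark "y > 0 arbitrary" in (5.21).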


As we see from the proof above, the growth restriction on |f(τ)| as Im(τ) → ∞ for premodular forms is equivalent to the statement that under the change of variables q = e^{2πiτ}, such a function expressed as a function of q is a holomorphic function on the punctured unit disc with a pole or removable singularity at q = 0. This suggests introducing the notion of “the point at i∞” as a way of discussing the behavior of premodular forms near q = 0 while still thinking in terms of the variable τ. We will write D ∪ {i∞} for the fundamental domain with this point at i∞ added and refer to it as the extended fundamental domain. We also introduce the following bit of terminology to describe the behavior of f(τ) near the point i∞: if the function g(q) associated with f(τ) as in (5.22) has a pole of some order k ≥ 1 at q = 0, then we say that f(τ) has a pole of order k at i∞. If g(q) has a zero of order k ≥ 1 at q = 0, we say that f(τ) has a zero of order k at i∞. As usual, we can unify those two concepts and regard both zeros and poles as two aspects of the same thing by declaring that f(τ) has a (generalized) zero of order k at τ = i∞ if g(q) has a generalized zero of order k at q = 0 in the sense of having an ordinary zero of order k if k ≥ 1, a pole of order −k if k < 0, or neither a zero nor a pole if k = 0. (Refer to the parallel discussion on this terminology in Section 1.10.)
5.7 Fourier expansions and number-theoretic identities


The functions G_{2k}(τ), Δ(τ), and J(τ) are all periodic functions with period 1 and, as we will see shortly, satisfy the growth condition (5.19) for being a premodular form. It turns out that their associated Fourier expansions, which we will now derive, are extremely interesting and lead to identities of a purely arithmetic nature.


Theorem 5.7 (Fourier expansion of the Eisenstein series). For k ≥ 2, the Eisenstein series G_{2k}(τ) is a premodular form and has the Fourier expansion

\[ G_{2k}(\tau) = 2\zeta(2k) + 2\,\frac{(2\pi i)^{2k}}{(2k-1)!} \sum_{n=1}^{\infty} \sigma_{2k-1}(n)\, q^n \qquad (q = e^{2\pi i \tau},\ \tau \in \mathbb{H}), \tag{5.23} \]

where σ_{2k−1}(n) is the generalized sum-of-divisors function defined in (5.2).


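Proof. Start with the partial fraction expansion of the cotangent function

\[ \pi \cot(\pi z) = \frac{1}{z} + \sum_{n \neq 0} \left( \frac{1}{z+n} - \frac{1}{n} \right) \tag{5.24} \]

(see (1.73)). Differentiating this expansion p times gives

\[ \frac{d^p}{dz^p} \big( \pi \cot(\pi z) \big) = (-1)^p\, p! \sum_{n=-\infty}^{\infty} \frac{1}{(z+n)^{p+1}} \qquad (p \ge 1). \tag{5.25} \]

On the other hand, note that for z ∈ H,

\[ \pi \cot(\pi z) = \pi\, \frac{\cos \pi z}{\sin \pi z} = \pi i \left( 1 - \frac{2}{1 - e^{2\pi i z}} \right) = -\pi i \left( 1 + 2 \sum_{\ell=1}^{\infty} e^{2\pi i \ell z} \right), \]

and therefore also

\[ \frac{d^p}{dz^p} \big( \pi \cot(\pi z) \big) = -(2\pi i)^{p+1} \sum_{\ell=1}^{\infty} \ell^p\, e^{2\pi i \ell z} \qquad (p \ge 1). \tag{5.26} \]

Now

\[ G_{2k}(\tau) = \sum_{(m,n) \neq (0,0)} \frac{1}{(m\tau+n)^{2k}} = \sum_{n \neq 0} \frac{1}{n^{2k}} + 2 \sum_{m=1}^{\infty} \sum_{n=-\infty}^{\infty} \frac{1}{(m\tau+n)^{2k}} \]
\[ = 2\zeta(2k) + 2 \sum_{m=1}^{\infty} \frac{(-1)^{2k-1}}{(2k-1)!} \left. \frac{d^{2k-1}}{dz^{2k-1}} \big( \pi \cot(\pi z) \big) \right|_{z = m\tau} \]
\[ = 2\zeta(2k) + 2\, \frac{(-1)^{2k} (2\pi i)^{2k}}{(2k-1)!} \sum_{m=1}^{\infty} \sum_{\ell=1}^{\infty} \ell^{2k-1} e^{2\pi i \ell m \tau} \]
\[ = 2\zeta(2k) + \frac{2 (2\pi i)^{2k}}{(2k-1)!} \sum_{n=1}^{\infty} \Big( \sum_{\ell m = n} \ell^{2k-1} \Big) e^{2\pi i n \tau} \]
\[ = 2\zeta(2k) + \frac{2 (2\pi i)^{2k}}{(2k-1)!} \sum_{n=1}^{\infty} \sigma_{2k-1}(n)\, e^{2\pi i n \tau}, \tag{5.27} \]

which is the claimed expansion. Since σ_{2k−1}(n) is bounded by a polynomial in n, expansion (5.23) clearly converges absolutely and uniformly in a neighborhood of q = 0 and defines a holomorphic function there. As we remarked in the previous section, this implies that G_{2k}(τ) is a premodular form.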
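As a sanity check on (5.23), one can compare a truncated version of the defining lattice sum with a truncated Fourier expansion. This is a rough numerical sketch, not a proof: the sample point τ = 0.3 + 1.1i, the square truncation of the lattice, the truncation lengths, and the function names are all assumptions made here for illustration.

```python
import cmath
import math

def sigma(k, n):
    """Generalized sum-of-divisors function: sum of d**k over the divisors d of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def eisenstein_lattice(k, tau, cutoff=60):
    """Truncated lattice sum of (m*tau + n)^(-2k) over 0 < max(|m|, |n|) <= cutoff."""
    total = 0j
    for m in range(-cutoff, cutoff + 1):
        for n in range(-cutoff, cutoff + 1):
            if m != 0 or n != 0:
                total += (m * tau + n) ** (-2 * k)
    return total

def eisenstein_fourier(k, tau, terms=40):
    """Right-hand side of (5.23), truncated after `terms` Fourier coefficients."""
    q = cmath.exp(2j * cmath.pi * tau)
    zeta = sum(n ** (-2 * k) for n in range(1, 5000))      # partial sum for zeta(2k)
    factor = 2 * (2j * cmath.pi) ** (2 * k) / math.factorial(2 * k - 1)
    series = sum(sigma(2 * k - 1, n) * q ** n for n in range(1, terms + 1))
    return 2 * zeta + factor * series

tau = complex(0.3, 1.1)
lattice_value = eisenstein_lattice(3, tau)    # G_6 from the lattice sum
fourier_value = eisenstein_fourier(3, tau)    # G_6 from the expansion (5.23)
```

The case k = 3 is used because the lattice sum for G_6 converges faster than the one for G_4, so a modest cutoff already gives close agreement; the Fourier side converges much faster still, since |q| is small.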


Theorem 5.8 (Fourier expansion of the modular discriminant). The modular discriminant Δ(τ) is a premodular form. Its Fourier expansion is given by

\[ \Delta(\tau) = (2\pi)^{12} \big( q - 24 q^2 + 252 q^3 - 1472 q^4 + \cdots \big)
 = (2\pi)^{12} \sum_{n=1}^{\infty} \tau(n)\, q^n \qquad (q = e^{2\pi i \tau},\ \tau \in \mathbb{H}). \tag{5.28} \]

Here the normalized coefficients (τ(n))_{n=1}^∞ are a sequence of integers, which are given explicitly by

\[ \tau(n) = 8000 \sum_{\substack{j,k \ge 0 \\ j+k \le n}} \sigma_3(j)\, \sigma_3(k)\, \sigma_3(n-j-k) \; - \; 147 \sum_{j=0}^{n} \sigma_5(j)\, \sigma_5(n-j) \tag{5.29} \]

for all n ≥ 1, where σ_3 and σ_5 denote the generalized sum-of-divisors functions as before with the additional convention that σ_3(0) = 1/240 and σ_5(0) = −1/504.


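Proof. We have Δ(τ) = 60^3 G_4(τ)^3 − 27 · 140^2 G_6(τ)^2, so Δ(τ) plainly inherits the property of being a premodular form from G_4(τ) and G_6(τ). To get its Fourier expansion, note that, by (5.23) and (1.95),

\[ 60^3 G_4(\tau)^3 = 60^3 \left( \frac{\pi^4}{45} + \frac{2 (2\pi)^4}{3!} \sum_{n=1}^{\infty} \sigma_3(n)\, q^n \right)^{3}
 = (2\pi)^{12} \cdot 8000 \left( \frac{1}{240} + \sum_{n=1}^{\infty} \sigma_3(n)\, q^n \right)^{3} \]
\[ = (2\pi)^{12} \cdot 8000 \sum_{n=0}^{\infty} \Big( \sum_{\substack{j,k \ge 0 \\ j+k \le n}} \sigma_3(j)\, \sigma_3(k)\, \sigma_3(n-j-k) \Big) q^n \]
\[ = (2\pi)^{12} \left( \frac{1}{1728} + 8000 \sum_{n=1}^{\infty} \Big( \sum_{\substack{j,k \ge 0 \\ j+k \le n}} \sigma_3(j)\, \sigma_3(k)\, \sigma_3(n-j-k) \Big) q^n \right), \]

and, similarly,

\[ 27 \cdot 140^2\, G_6(\tau)^2 = 27 \cdot 140^2 \left( \frac{2\pi^6}{945} - \frac{2 (2\pi)^6}{5!} \sum_{n=1}^{\infty} \sigma_5(n)\, q^n \right)^{2}
 = (2\pi)^{12} \cdot 4 \cdot 27 \cdot \frac{140^2}{120^2} \left( -\frac{1}{504} + \sum_{n=1}^{\infty} \sigma_5(n)\, q^n \right)^{2} \]
\[ = (2\pi)^{12} \cdot 147 \sum_{n=0}^{\infty} \Big( \sum_{j=0}^{n} \sigma_5(j)\, \sigma_5(n-j) \Big) q^n
 = (2\pi)^{12} \left( \frac{1}{1728} + 147 \sum_{n=1}^{\infty} \Big( \sum_{j=0}^{n} \sigma_5(j)\, \sigma_5(n-j) \Big) q^n \right). \]

Subtracting these two expressions leads to (5.28)–(5.29).

It remains to show that τ(n) is an integer. Observe that in representation (5.29), all the summands are integers, except possibly those for which one or more of the arguments of σ_3 or σ_5 are equal to 0. The total contribution of these exceptional summands to τ(n) can be expressed as

\[ 3 \times 8000\, \sigma_3(0)^2 \sigma_3(n) + 3 \times 8000\, \sigma_3(0) \sum_{k=1}^{n-1} \sigma_3(k)\, \sigma_3(n-k) - 2 \times 147\, \sigma_5(0)\, \sigma_5(n) \]
\[ = \frac{5}{12}\, \sigma_3(n) + 100 \sum_{k=1}^{n-1} \sigma_3(k)\, \sigma_3(n-k) + \frac{7}{12}\, \sigma_5(n)
 = 100 \sum_{k=1}^{n-1} \sigma_3(k)\, \sigma_3(n-k) + \sum_{d \mid n} \frac{5 d^3 + 7 d^5}{12}. \]

This is in fact an integer, since it is easy to check that 5d^3 + 7d^5 is divisible by 12 for any integer d. (Another famous formula for Δ(τ) that we will prove later makes it immediate to see that the τ(n) are integers; see Theorem 5.31 in Section 5.14.)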
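Formula (5.29) is easy to evaluate with exact rational arithmetic, which gives a direct check on both the formula and the integrality argument. The sketch below assumes only the conventions σ_3(0) = 1/240 and σ_5(0) = −1/504 from Theorem 5.8; the function names are illustrative choices made here.

```python
from fractions import Fraction

def sigma(k, n):
    """Sum-of-divisors function with the conventions sigma_3(0), sigma_5(0) of Theorem 5.8."""
    if n == 0:
        return {3: Fraction(1, 240), 5: Fraction(-1, 504)}[k]
    return Fraction(sum(d ** k for d in range(1, n + 1) if n % d == 0))

def tau(n):
    """Evaluate the right-hand side of (5.29) exactly."""
    triple = sum(sigma(3, j) * sigma(3, k) * sigma(3, n - j - k)
                 for j in range(n + 1) for k in range(n + 1 - j))
    double = sum(sigma(5, j) * sigma(5, n - j) for j in range(n + 1))
    return 8000 * triple - 147 * double

values = [tau(n) for n in range(1, 7)]

# The divisibility fact used in the integrality argument, checked on a range of integers.
divisible = all((5 * d ** 3 + 7 * d ** 5) % 12 == 0 for d in range(-50, 51))
```

The first four entries of `values` should agree with the coefficients 1, −24, 252, −1472 displayed in (5.28), and every entry should be a fraction with denominator 1.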


The sequence of normalized Fourier coefficients

\[ (\tau(n))_{n=1}^{\infty} = 1,\ -24,\ 252,\ -1472,\ 4830,\ -6048,\ \ldots \]

of the modular discriminant is called Ramanujan’s tau function.⁴ It is a celebrated mathematical object with many remarkable properties. To name one example, one of the surprising results of the theory of modular forms, which we will not prove here, is the following property, conjectured by Ramanujan in 1916 and proved shortly afterward by Mordell.


4 Beware the small notational quirk of the theory wherein the letter τ is used to denote both the sequence τ(n) and the modular variable τ.
Theorem 5.9 (Multiplicativity of Ramanujan’s tau function). If m, n ≥ 1 are relatively prime integers, then τ(mn) = τ(m)τ(n).


For the proof, see [5, Ch. 6].

Theorem 5.10 (Fourier expansion of Klein’s J-invariant). Klein’s J-invariant is a premodular form and has the Fourier expansion

\[ J(\tau) = \frac{1}{1728} \left( \frac{1}{q} + 744 + 196884\, q + 21493760\, q^2 + \cdots \right)
 = \frac{1}{1728} \sum_{n=-1}^{\infty} c(n)\, q^n \qquad (q = e^{2\pi i \tau},\ \tau \in \mathbb{H}). \]

The coefficients c(n) are all positive integers.


Proof. Exercise 5.5. 
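The first few coefficients c(n) can be reproduced by integer power series arithmetic. The sketch below is a computation, not a solution of the exercise. It assumes the normalized series E_4 = 1 + 240 Σ σ_3(n) q^n and E_6 = 1 − 504 Σ σ_5(n) q^n (notation introduced here; these are G_4 and G_6 from (5.23) divided by their constant terms), for which the expansions in the proof of Theorem 5.8 give Δ = (2π)^{12} (E_4^3 − E_6^2)/1728 and hence 1728 J = 1728 E_4^3 / (E_4^3 − E_6^2).

```python
def sigma(k, n):
    """Sum of d**k over the divisors d of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def multiply(a, b):
    """Product of two power series given as equal-length coefficient lists, truncated."""
    size = len(a)
    out = [0] * size
    for i, ai in enumerate(a):
        for j in range(size - i):
            out[i + j] += ai * b[j]
    return out

def j_coefficients(count):
    """Return [c(-1), c(0), ..., c(count - 2)] for 1728 J = sum of c(n) q^n."""
    size = count + 1
    e4 = [1] + [240 * sigma(3, n) for n in range(1, size)]
    e6 = [1] + [-504 * sigma(5, n) for n in range(1, size)]
    e4_cubed = multiply(multiply(e4, e4), e4)
    e6_squared = multiply(e6, e6)
    # (e4^3 - e6^2) / 1728 = q - 24 q^2 + ..., the normalized discriminant series.
    delta = [(x - y) // 1728 for x, y in zip(e4_cubed, e6_squared)]
    shifted = delta[1:]                 # coefficients of delta / q; the first one is 1
    c = []
    for n in range(count):
        # Solve e4^3 = (sum of c(n) q^n) * delta coefficient by coefficient.
        c.append(e4_cubed[n] - sum(c[m] * shifted[n - m] for m in range(n)))
    return c

coefficients = j_coefficients(4)
```

Because the series Δ/q has leading coefficient 1, the division never leaves the integers, which is one way to see why the c(n) are integers (positivity is a separate matter).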


The coefficients c(n) are also a much-studied sequence of numbers. In the late 1970s, 
they were found to be related to dimensions of the irreducible representations of the so- 
called monster group, a connection that was developed into a deep mathematical the- 
ory and is sometimes referred to as monstrous moonshine. The story of this discovery 
and some of the amazing mathematical ideas it led to is told in [29]. 

More mundane, but still interesting, is a result due to Petersson from 1932, which states that the asymptotic rate of growth of the coefficients c(n) is given by

\[ c(n) \sim \frac{1}{\sqrt{2}\, n^{3/4}}\, e^{4\pi \sqrt{n}} \qquad \text{as } n \to \infty. \tag{5.30} \]

This result is conceptually related to another famous result, the Hardy–Ramanujan formula for the asymptotic rate of growth of the number p(n) of integer partitions of n. That formula states that

\[ p(n) \sim \frac{1}{4\sqrt{3}\, n}\, e^{\pi \sqrt{2n/3}} \qquad \text{as } n \to \infty. \tag{5.31} \]

Both (5.30) and (5.31) can be proved using complex analysis; see [22], [66, Appendix A].

The Fourier expansions (5.23) and (5.28) make it possible to translate various identities involving the functions G_{2k} and Δ into number-theoretic identities.
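To get a feeling for the Hardy–Ramanujan asymptotic (5.31), one can compute p(n) exactly and compare it with the right-hand side. The following is a small sketch; the dynamic-programming recurrence used to count partitions is a standard technique brought in here for illustration and is not discussed in the text.

```python
import math

def partitions_up_to(limit):
    """Return [p(0), p(1), ..., p(limit)], adding one allowed part size at a time."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p

def hardy_ramanujan(n):
    """Right-hand side of (5.31)."""
    return math.exp(math.pi * math.sqrt(2 * n / 3)) / (4 * math.sqrt(3) * n)

p = partitions_up_to(1000)
ratios = {n: p[n] / hardy_ramanujan(n) for n in (10, 100, 1000)}
```

In this range the ratio p(n) / (asymptotic value) stays below 1 and creeps upward toward 1 as n grows, which is consistent with (5.31) but of course says nothing rigorous about the limit.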


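Theorem 5.11. We have the following number-theoretic identities for all n ≥ 1:

\[ \sigma_7(n) = \sigma_3(n) + 120 \sum_{k=1}^{n-1} \sigma_3(k)\, \sigma_3(n-k), \tag{5.32} \]

\[ 11\, \sigma_9(n) = 21\, \sigma_5(n) - 10\, \sigma_3(n) + 5040 \sum_{k=1}^{n-1} \sigma_3(k)\, \sigma_5(n-k), \tag{5.33} \]

\[ \sigma_{13}(n) = 11\, \sigma_9(n) - 10\, \sigma_3(n) + 2640 \sum_{k=1}^{n-1} \sigma_3(k)\, \sigma_9(n-k), \tag{5.34} \]

\[ \tau(n) = \frac{65}{756}\, \sigma_{11}(n) + \frac{691}{756}\, \sigma_5(n) - \frac{691}{3} \sum_{k=1}^{n-1} \sigma_5(k)\, \sigma_5(n-k). \tag{5.35} \]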


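Proof of (5.32) and (5.33). We consider the Fourier expansions of both sides of the identity G_8 = (3/7) G_4^2 from Section 4.7. By (5.23) the left-hand side is

\[ G_8(\tau) = 2\zeta(8) + 2\, \frac{(2\pi i)^8}{7!} \sum_{n=1}^{\infty} \sigma_7(n)\, q^n
 = 2\, \frac{(2\pi)^8}{7!} \left( \frac{1}{480} + \sum_{n=1}^{\infty} \sigma_7(n)\, q^n \right). \]

The right-hand side is

\[ \frac{3}{7} \left( 2\zeta(4) + 2\, \frac{(2\pi i)^4}{3!} \sum_{n=1}^{\infty} \sigma_3(n)\, q^n \right)^{2}
 = \frac{\pi^8}{4725} + \frac{32 \pi^8}{315} \sum_{n=1}^{\infty} \sigma_3(n)\, q^n + \frac{256 \pi^8}{21} \sum_{n=1}^{\infty} \Big( \sum_{k=1}^{n-1} \sigma_3(k)\, \sigma_3(n-k) \Big) q^n. \]

Equating the coefficients of q^n in the above expressions gives identity (5.32).

Identity (5.33) follows similarly from the Eisenstein series identity G_{10} = (5/11) G_4 G_6, which we also discussed in Section 4.7. We omit the details of this simple calculation.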


The principle behind identities (5.34) and (5.35) is similar. They follow by equating the Fourier coefficients in the Eisenstein series identities

\[ G_{14} = \frac{6}{13}\, G_4\, G_{10}, \tag{5.36} \]
\[ \Delta = 1200 \big( 1430\, G_{12} - 691\, G_6^2 \big), \tag{5.37} \]

respectively. These are not identities that we have previously derived, but they are conceptually similar to (4.19)–(4.21) and can be proved without great effort using the results of Section 4.7. However, rather than pursue this method, we will instead show in Section 5.12 a more elegant way of obtaining them (and similar identities) by applying more general ideas we will develop about modular forms.
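Identities (5.32)–(5.34) involve only integers, so they can be confirmed for small n by direct computation. The sketch below is a finite check for n up to 40 (an arbitrary cutoff chosen here) and is no substitute for the proofs; the helper names are illustrative.

```python
def sigma(k, n):
    """Sum of d**k over the divisors d of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def convolution(a, b, n):
    """Sum of sigma_a(k) * sigma_b(n - k) over 1 <= k <= n - 1."""
    return sum(sigma(a, k) * sigma(b, n - k) for k in range(1, n))

def check_identities(n):
    """Return the truth values of (5.32), (5.33), (5.34) at the given n."""
    id_32 = sigma(7, n) == sigma(3, n) + 120 * convolution(3, 3, n)
    id_33 = (11 * sigma(9, n)
             == 21 * sigma(5, n) - 10 * sigma(3, n) + 5040 * convolution(3, 5, n))
    id_34 = (sigma(13, n)
             == 11 * sigma(9, n) - 10 * sigma(3, n) + 2640 * convolution(3, 9, n))
    return id_32, id_33, id_34

results = [check_identities(n) for n in range(1, 41)]
all_hold = all(all(triple) for triple in results)
```

For n = 2, for instance, (5.32) reads 129 = 9 + 120 · 1.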

Many more identities with a similar flavor to (5.32)-(5.35) are known to exist and 
can be proved using modular form techniques (or, through a much more painstaking 
analysis, using manipulations of a purely elementary nature [63]). As an example of a 
more sophisticated identity whose proof requires additional background, we mention 
the following identity due to Niebur [50]: 


\[ \tau(n) = n^4 \sigma_1(n) - 24 \sum_{k=1}^{n-1} \big( 35 k^4 - 52 k^3 n + 18 k^2 n^2 \big)\, \sigma_1(k)\, \sigma_1(n-k). \]
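Identity (5.35) and Niebur's identity can be tested in the same spirit once the values τ(n) are available. In the sketch below, τ(n) is read off as the coefficient of q^n in (E_4^3 − E_6^2)/1728, where E_4 = 1 + 240 Σ σ_3(n) q^n and E_6 = 1 − 504 Σ σ_5(n) q^n; this normalization is notation assumed here, and the relation is a rearrangement of the two expansions in the proof of Theorem 5.8. Both identities are checked only for n up to 30.

```python
def sigma(k, n):
    """Sum of d**k over the divisors d of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def multiply(a, b):
    """Product of two power series given as equal-length coefficient lists, truncated."""
    size = len(a)
    out = [0] * size
    for i, ai in enumerate(a):
        for j in range(size - i):
            out[i + j] += ai * b[j]
    return out

def tau_values(limit):
    """Return a list t with t[n] = tau(n) for 1 <= n <= limit (t[0] is 0)."""
    e4 = [1] + [240 * sigma(3, n) for n in range(1, limit + 1)]
    e6 = [1] + [-504 * sigma(5, n) for n in range(1, limit + 1)]
    cube = multiply(multiply(e4, e4), e4)
    square = multiply(e6, e6)
    return [(x - y) // 1728 for x, y in zip(cube, square)]

def identity_5_35(n, t):
    """(5.35) multiplied through by 756 to stay within the integers."""
    conv = sum(sigma(5, k) * sigma(5, n - k) for k in range(1, n))
    return 756 * t[n] == 65 * sigma(11, n) + 691 * sigma(5, n) - 252 * 691 * conv

def niebur(n, t):
    """Niebur's identity for tau(n)."""
    s = sum((35 * k ** 4 - 52 * k ** 3 * n + 18 * k ** 2 * n ** 2)
            * sigma(1, k) * sigma(1, n - k) for k in range(1, n))
    return t[n] == n ** 4 * sigma(1, n) - 24 * s

t = tau_values(30)
both_hold = all(identity_5_35(n, t) and niebur(n, t) for n in range(1, 31))
```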
The above discussion gives a glimpse into some of the close connections that exist between arithmetic and modular forms. As seen in the examples above, one way in which these connections manifest themselves is that the Fourier coefficients of naturally occurring modular forms (or mildly renormalized versions of them) are often integers with interesting arithmetic properties.


Suggested exercises for Section 5.7. 5.5. 


5.8 Modular functions 


A meromorphic function f : H → C is called a modular function if it is a weak premodular form and satisfies

\[ f\!\left( \frac{a\tau + b}{c\tau + d} \right) = f(\tau) \tag{5.38} \]

for all τ ∈ H and all matrices A with rows (a, b) and (c, d) in Γ. That is, a modular function is a true meromorphic function on the modular surface (including also the point i∞). Note that since Γ is generated by the elements T, S, to verify the modular invariance property (5.38), it suffices to check that f(τ) satisfies

\[ f(\tau + 1) = f(\tau), \qquad f(-1/\tau) = f(\tau) \qquad (\tau \in \mathbb{H}). \tag{5.39} \]

(The first of these two equations is already guaranteed by the condition that f(τ) is a weak premodular form.)

A modular function f(τ) that is not the zero function has only finitely many zeros and poles in D: indeed, the zeros and poles cannot have i∞ as an accumulation point (otherwise, i∞ would be an essential singularity rather than a pole or removable singularity), which means that all the zeros and poles of f(τ) in the closure cl(D) are concentrated in the intersection of the closure with the strip {0 < Im(τ) ≤ M} for some M > 0. This intersection is compact, so if there were an infinite sequence of zeros or poles of f(τ) in it, the sequence would have an accumulation point in H, and f would then be identically zero or have an essential singularity in H, which is not allowed.

An essential property of modular functions is analogous to Proposition 4.6, which we encountered in our discussion of elliptic functions in Chapter 4; loosely speaking, it states that the total number of zeros of a modular function in the fundamental domain is equal to its total number of poles (as usual, counted with multiplicities, and the point i∞ needs to be included in the count as well). An additional caveat in the current setting is that the “numbers” being referred to are actually weighted counts of points with respect to a certain weight function. We define the weight w(τ) of a point τ ∈ D by

\[ w(\tau) = \begin{cases} \tfrac{1}{2} & \text{if } \tau = i, \\ \tfrac{1}{3} & \text{if } \tau = e^{2\pi i/3}, \\ 1 & \text{if } \tau = i\infty, \\ 1 & \text{otherwise.} \end{cases} \]


Theorem 5.12 (The weight formula for modular functions). Let f : H → C be a modular function other than the zero function. Then

\[ \sum_{f(\zeta) = 0} w(\zeta) = \sum_{f(\zeta) = \infty} w(\zeta). \tag{5.40} \]

Here the summation on the left-hand side ranges over zeros ζ of f(τ) in D, counted with multiplicities, and the summation on the right-hand side ranges over poles ζ of f(τ) in D, counted with multiplicities. In both summations, we include the point i∞ with appropriate multiplicity if f(τ) has a zero or a pole there.


Proof. Consider a contour integral of the form

\[ \oint_G \frac{f'(\tau)}{f(\tau)}\, d\tau \tag{5.41} \]

around a suitable contour G that, as a first approximation, hugs the boundary of the fundamental domain D up to some vertical level M in the imaginary direction (Fig. 5.2(a)). The parameter M > 0 is chosen larger than the imaginary values of any of the zeros or poles of f(τ), other than possibly the point i∞ (we discussed earlier why such an M exists). Now the general idea of the proof is to evaluate the contour integral in two ways. This is not conceptually difficult, but involves some technicalities of a somewhat tedious nature (which are nonetheless essential to check carefully), so to make things clearer pedagogically, we build up the calculation in several successive versions, each improving on the previous one.


First version. In the first version of the proof, we assume for simplicity that f(τ) does not have any zeros or poles on the boundary of D. The integration contour in that case takes the form shown in Fig. 5.2(a) and decomposes as a sum of five subcontours

\[ G = \gamma_1 + \gamma_2 + \gamma_3 + \gamma_4 + \gamma_5. \]

Denote by X the total number of zeros of f(τ) in D and by Y the total number of poles, counted with multiplicities. Then the integral (5.41) is equal to 2πi(X − Y).

On the other hand, denote by Z the order of the zero f(τ) has at i∞ (with the usual convention that Z is taken negative if f(τ) has a pole there). Breaking up the integral (5.41) into the integrals over the five subcontours γ_j, 1 ≤ j ≤ 5, we wish to show that it is equal to −2πiZ. This will give the equation

\[ X + Z = Y \quad \text{or equivalently,} \quad X = Y - Z, \]

and one or both of these two equations (depending on whether Z > 0, Z = 0, or Z < 0; that is, whether i∞ is a zero, a pole, or neither a zero nor a pole) is what (5.40) claims under our simplified assumptions on f(τ).

Figure 5.2: (a) The integration contour G used in the first version of the proof. (b) The modified version of G with detours added around zeros and poles on the boundary of D. The detours on γ_5 are images of the detours on γ_4 under the inversion map τ ↦ −1/τ. (c) The third version in which detours (labeled γ_6, γ_7, γ_8) are also added around i, e^{2πi/3}, and e^{πi/3}.

Start with the contributions to (5.41) from the subcontours γ_1 and γ_3. Those are trivially seen to cancel each other out, summing up to 0 because of the periodicity property f(τ + 1) = f(τ).

Second, we show that the contributions from the subcontours γ_4 and γ_5 likewise cancel each other out. This follows by making the change of variables ρ = −1/τ in the integral over γ_5, which maps γ_5 to −γ_4 and therefore gives that

\[ \int_{\gamma_5} \frac{f'(\tau)}{f(\tau)}\, d\tau = \int_{-\gamma_4} \frac{f'(-1/\rho)}{f(-1/\rho)}\, \frac{d\rho}{\rho^2} = - \int_{\gamma_4} \frac{f'(\rho)}{f(\rho)}\, d\rho. \tag{5.42} \]

(The last equality follows from the relation τ^{−2} f'(−1/τ) = f'(τ), obtained by differentiating the second identity in (5.39).)
The contribution from the integral over the remaining subcontour γ_2 can be evaluated by once again using the change of variables q = e^{2πiτ}, which transforms this subcontour into a circle of radius e^{−2πM} around q = 0 in the q-plane. Denote φ(q) = f(τ); as discussed in the proof of Proposition 5.6, φ(q) is a well-defined meromorphic function on the punctured unit disc because f(τ) is periodic, and it has a zero of order Z (or a pole of order −Z) at the origin. Under this change of variables, we have

\[ f'(\tau) = \frac{dq}{d\tau}\, \varphi'(q) = 2\pi i\, q\, \varphi'(q), \]

and therefore

\[ \frac{f'(\tau)}{f(\tau)}\, d\tau = \frac{2\pi i\, q\, \varphi'(q)}{\varphi(q)} \cdot \frac{dq}{2\pi i\, q} = \frac{\varphi'(q)}{\varphi(q)}\, dq, \]

which then implies that

\[ \int_{\gamma_2} \frac{f'(\tau)}{f(\tau)}\, d\tau = - \oint_{|q| = e^{-2\pi M}} \frac{\varphi'(q)}{\varphi(q)}\, dq, \]

nicely mapping the integral to a similar-looking one in the variable q, except that the integral in the q variable is over a closed curve (and receives a minus sign since the change of variables maps the subcontour γ_2 to a circle oriented in the negative (clockwise) direction around q = 0). By the argument principle (Theorem 1.48) this last integral is equal to 2πi times the number of zeros minus the number of poles of φ(q) inside the circle |q| = e^{−2πM}. Since M was taken large enough so that f(τ) has no zeros or poles with imaginary value greater than M, its value is equal to 2πiZ.

Putting the above results together, we have shown that

\[ \oint_G \frac{f'(\tau)}{f(\tau)}\, d\tau = \left( \int_{\gamma_1} + \int_{\gamma_3} \right) \frac{f'(\tau)}{f(\tau)}\, d\tau + \left( \int_{\gamma_4} + \int_{\gamma_5} \right) \frac{f'(\tau)}{f(\tau)}\, d\tau + \int_{\gamma_2} \frac{f'(\tau)}{f(\tau)}\, d\tau = 0 + 0 - 2\pi i Z = -2\pi i Z. \]

As we pointed out earlier, this was exactly the equality needed to balance the books and conclude that (5.40) holds.


Second version. For the next iteration of the proof, consider a situation in which f(τ) might now have zeros or poles on the boundary of D but assume that it does not have zeros or poles at τ = i or τ = e^{2πi/3}. The above proof can then be amended by modifying the integration contour G to add small “detours” bypassing each of the boundary zeros and poles, as shown in Fig. 5.2(b). The requirements for the detours are as follows: first, the detours on the subcontour γ_1 of G dip into D, are matched by detours of the same shape moving away from D along the subcontour γ_3, and are small enough so that each detour goes around exactly one of the distinct zeros and poles. In this way, the contributions to the integral (5.41) from the subcontours γ_1 and γ_3 still trivially cancel each other out as before.

Second, the detours along the two circular arc segments γ_4 and γ_5 also move away from the unit circle in opposite directions, with the detours on γ_4 dipping into the fundamental domain, and those on γ_5 moving away from it. The precise shapes of these detours are not important; it is important to make them small enough (so that each detour only goes around a single pole or zero), and the shape of each detour around a zero or pole τ_0 along γ_5 should be associated with the shape of the detour around the “reflected” point −1/τ_0 lying along γ_4 in such a way that γ_5 coincides with the image of γ_4 under the map τ ↦ −1/τ.

Again, because the contours γ_4 and γ_5 have been matched to each other as we described, we will still have cancelation of the contributions to the integral (5.41) from γ_4 and γ_5 (since the first equality in (5.42) remains valid).

Now, with the modifications to the contour G described above, you can easily convince yourself that the integral (5.41) is still equal to 2πi(X − Y), where X and Y denote the same quantities as before. As a result, all the arguments from the first version of the proof remain valid, and we conclude that (5.40) holds in the same way as before.


Third version. So far we have avoided thinking about zeros and poles at τ = i and τ = e^{2πi/3}, so we did not really have to grapple with the question of where the weights 1/2 and 1/3 in the definition of the weight function w(τ) come from. We now prove the theorem in its full generality, in the setting where f(τ) is allowed to have zeros and poles on the boundary of D, including possibly at τ = i and τ = e^{2πi/3}. Let X, Y, Z be as before, except that we now define X and Y more carefully as the respective numbers of zeros and poles of f that are in D other than the points τ = i and τ = e^{2πi/3}. Now denote additionally by Q and R the orders of the zeros of f(τ) at τ = i and τ = e^{2πi/3}, respectively (again with the convention that they are negative if we have poles instead of zeros). In this setting, we modify the contour again, introducing additional detours around τ = i, τ = e^{2πi/3}, and τ = e^{πi/3}, as shown in Fig. 5.2(c). These detours are taken as circular arcs of some radius r, chosen small enough so that no other zeros or poles of f(τ) lie within distance r of the special points τ = i, e^{2πi/3}, and e^{πi/3}.

With this notation, the decomposition of G into segments now has the form

\[ G = \gamma_1 + \gamma_2 + \gamma_3 + \gamma_4 + \gamma_5 + \gamma_6(r) + \gamma_7(r) + \gamma_8(r), \]

where γ_6 = γ_6(r), γ_7 = γ_7(r), and γ_8 = γ_8(r) denote the three added circular arcs; we emphasize their dependence on r in our notation for reasons that will become clear shortly.




Now retracing the reasoning in the previous version of the proof, we see that the conclusion X − Y = −Z is now modified to

\[ X - Y = -Z + \frac{1}{2\pi i} \int_{\gamma_6(r)} \frac{f'(\tau)}{f(\tau)}\, d\tau + \frac{1}{2\pi i} \int_{\gamma_7(r)} \frac{f'(\tau)}{f(\tau)}\, d\tau + \frac{1}{2\pi i} \int_{\gamma_8(r)} \frac{f'(\tau)}{f(\tau)}\, d\tau. \tag{5.43} \]

To understand the contribution from the new integrals over γ_6(r), γ_7(r), and γ_8(r), consider the local behavior of f(τ) near τ = i and τ = e^{2πi/3}. Using our notation Q, R for the orders of the zeros at these exceptional points, we can factor f(τ) as

\[ f(\tau) = (\tau - i)^Q\, g(\tau) \]

for τ in some neighborhood V of i, with g(τ) being holomorphic and nonzero in that neighborhood. Therefore the integral over γ_6(r) can be evaluated (assuming that r is small enough so that the disc of radius r around i is contained in V) as

\[ \int_{\gamma_6(r)} \frac{f'(\tau)}{f(\tau)}\, d\tau = Q \int_{\gamma_6(r)} \frac{d\tau}{\tau - i} + \int_{\gamma_6(r)} \frac{g'(\tau)}{g(\tau)}\, d\tau. \]

Denote by θ_r the angle subtended by the circular arc γ_6(r) (relative to the center point i of the circle of which that arc forms a part). Then by explicit parameterization of the integral of 1/(τ − i) above, it is easily seen that that integral (without the constant Q in front of it) is equal to −iθ_r, the minus sign reflecting the clockwise orientation of the arc around i. For the second integral involving g'(τ)/g(τ), we can bound it as

\[ \left| \int_{\gamma_6(r)} \frac{g'(\tau)}{g(\tau)}\, d\tau \right| \le 2\pi M r, \]

where M is a positive constant such that |g'(τ)/g(τ)| ≤ M for τ ∈ V. Thus we have shown that

\[ \int_{\gamma_6(r)} \frac{f'(\tau)}{f(\tau)}\, d\tau = -i Q \theta_r + O(r) \tag{5.44} \]

for small r. Furthermore, it is geometrically obvious (and trivial to show formally if desired) that θ_r → π as r → 0.
Similarly, the integrals over γ_7(r) and γ_8(r) can be understood by writing a factorization for f(τ) of the form

\[ f(\tau) = (\tau - e^{2\pi i/3})^R\, h(\tau), \tag{5.45} \]

valid in a neighborhood of e^{2πi/3}, with h(τ) holomorphic and nonzero in that neighborhood. By the periodicity of f this also implies that for τ in a neighborhood of e^{πi/3} = e^{2πi/3} + 1, we have a similar factorization

\[ f(\tau) = f(\tau - 1) = (\tau - e^{\pi i/3})^R\, h(\tau - 1), \tag{5.46} \]

where again τ ↦ h(τ − 1) is holomorphic and nonzero in the neighborhood of e^{πi/3}. From representations (5.45)–(5.46), by a similar calculation as the one that led us to (5.44), we get that

\[ \int_{\gamma_7(r)} \frac{f'(\tau)}{f(\tau)}\, d\tau = -i R \varphi_r + O(r) \quad \text{and} \quad \int_{\gamma_8(r)} \frac{f'(\tau)}{f(\tau)}\, d\tau = -i R \varphi_r + O(r) \]

for r near 0, where φ_r denotes the angle subtended by each of the circular arcs γ_7(r) and γ_8(r) relative to the center points e^{2πi/3} and e^{πi/3} of the circles of which these arcs are a part. It is easy to see that φ_r → π/3 as r → 0.

Combining (5.43) and the other results noted above, we have shown that

\[ X - Y = -Z - \frac{\theta_r}{2\pi}\, Q - \frac{2\varphi_r}{2\pi}\, R + O(r) \]

for r near 0. Passing to the limit as r → 0, this becomes

\[ X - Y = -Z - \frac{1}{2}\, Q - \frac{1}{3}\, R, \]

which, as we see upon inspection, is simply another way of writing (5.40).


Corollary 5.13. Let f : H → C be a nonconstant modular function. Then f takes on any value an equal number of times in D; that is, the weighted number of zeros of f(τ) − a in D calculated in the sense of the left-hand side of (5.40) is the same for any a ∈ C.


Proof. The right-hand side of (5.40) remains the same when we replace f(τ) by f(τ) − a, since the two functions have the same poles with the same multiplicities.


Corollary 5.14. A modular function without poles in D is a constant. 


Proof. If f is a modular function without poles in D, then f(τ) must in fact be bounded, since f is bounded in a neighborhood of i∞ (that is, a half-plane of the form {Im(τ) ≥ M}), and separately from that, it is bounded in the (compact) intersection of {0 < Im(τ) ≤ M} with the closure cl(D) of the fundamental domain.

Now since f is bounded, that means that for some a ∈ C, the equation f(τ) = a has no solutions. By Corollary 5.13, if f were nonconstant, then the equation f(τ) = a would have no solutions for all a ∈ C, which is obviously impossible. Therefore f is a constant.


5.9 Klein’s J-invariant 


Let us now apply some of the understanding we developed on modular functions to Klein’s function J(τ).
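Lemma 5.15. The Eisenstein series G_{2k}(τ) satisfy

G_{2k}(i) = 0 if k is odd, and
G_{2k}(e^{2πi/3}) = 0 if k is not divisible by 3.


Proof. We have

\[ G_{2k}(i) = \sum_{(m,n) \in \mathbb{Z}^2 \setminus (0,0)} \frac{1}{(m + ni)^{2k}} = \sum_{(m,n) \in \mathbb{Z}^2 \setminus (0,0)} \frac{1}{i^{2k} (-im + n)^{2k}}
 = (-1)^k \sum_{(m,n) \in \mathbb{Z}^2 \setminus (0,0)} \frac{1}{(n - mi)^{2k}} = (-1)^k\, G_{2k}(i), \]

which implies that G_{2k}(i) = 0 if k is odd. Similarly, we write

\[ G_{2k}(e^{2\pi i/3}) = \sum_{(m,n) \in \mathbb{Z}^2 \setminus (0,0)} \frac{1}{(m + n e^{2\pi i/3})^{2k}}
 = \sum_{(m,n) \in \mathbb{Z}^2 \setminus (0,0)} \frac{1}{e^{2(2k)\pi i/3} (m e^{-2\pi i/3} + n)^{2k}} \]
\[ = e^{-4k\pi i/3} \sum_{(m,n) \in \mathbb{Z}^2 \setminus (0,0)} \frac{1}{(m(-e^{2\pi i/3} - 1) + n)^{2k}}
 = e^{-4k\pi i/3} \sum_{(m,n) \in \mathbb{Z}^2 \setminus (0,0)} \frac{1}{((n - m) + (-m) e^{2\pi i/3})^{2k}} \]
\[ = e^{-4k\pi i/3} \sum_{(p,q) \in \mathbb{Z}^2 \setminus (0,0)} \frac{1}{(p + q e^{2\pi i/3})^{2k}} = e^{-4k\pi i/3}\, G_{2k}(e^{2\pi i/3}). \]

Since e^{−4kπi/3} ≠ 1 if k is not divisible by 3, the desired conclusion follows.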


Proposition 5.16. The function J(τ) is a modular function. At the special points τ = i, τ = e^{2πi/3}, and τ = i∞, it takes the values

\[ J(e^{2\pi i/3}) = 0, \qquad J(i) = 1, \qquad J(i\infty) = \infty. \]

The zero at e^{2πi/3} is of order 3, the zero of J(τ) − 1 at τ = i is of order 2, and the pole at i∞ is simple.


Proof. We know from Lemma 4.23 that Δ(τ) is never zero for τ ∈ H, and from the Fourier expansion (5.28) we see that Δ(τ) has a simple zero at τ = i∞. Therefore J(τ) has a simple pole at i∞ and no other poles. We can also see using Lemma 5.15 that

\[ J(i) = \frac{g_2(i)^3}{\Delta(i)} = \frac{g_2(i)^3}{g_2(i)^3 - 27 g_3(i)^2} = 1, \]

since g_3(i) = 140 G_6(i) = 0. Similarly, g_2(e^{2πi/3}) = 60 G_4(e^{2πi/3}) = 0, so

\[ J(e^{2\pi i/3}) = \frac{g_2(e^{2\pi i/3})^3}{\Delta(e^{2\pi i/3})} = 0. \]

Now the zero of J(τ) at e^{2πi/3} must be of an order that balances out the simple pole at i∞ in accordance with (5.40). This implies that it is a zero of order 3. Applying the same reasoning to the zero of J(τ) − 1 at τ = i shows that that zero is of order 2.
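The special values in Proposition 5.16, together with the invariance properties (5.39) for f = J, can be observed numerically from the Fourier expansions. The sketch below assumes the normalized series E_4 = 1 + 240 Σ σ_3(n) q^n and E_6 = 1 − 504 Σ σ_5(n) q^n (notation introduced here), in terms of which J = g_2^3/(g_2^3 − 27 g_3^2) becomes E_4^3/(E_4^3 − E_6^2); the truncation length and the sample point 0.2 + 1.3i are arbitrary choices.

```python
import cmath

def sigma(k, n):
    """Sum of d**k over the divisors d of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def klein_j(tau, terms=60):
    """Approximate J(tau) = E4^3 / (E4^3 - E6^2) from truncated q-expansions."""
    q = cmath.exp(2j * cmath.pi * tau)
    e4 = 1 + 240 * sum(sigma(3, n) * q ** n for n in range(1, terms))
    e6 = 1 - 504 * sum(sigma(5, n) * q ** n for n in range(1, terms))
    return e4 ** 3 / (e4 ** 3 - e6 ** 2)

rho = cmath.exp(2j * cmath.pi / 3)
value_at_i = klein_j(1j)             # Proposition 5.16 predicts 1
value_at_rho = klein_j(rho)          # Proposition 5.16 predicts 0

tau0 = complex(0.2, 1.3)
original = klein_j(tau0)
translated = klein_j(tau0 + 1)       # J(tau + 1) should equal J(tau)
inverted = klein_j(-1 / tau0)        # J(-1/tau) should equal J(tau)
```

The q-expansions converge quickly at all of these points because |q| is at most about e^{−π√3} ≈ 0.004 at ρ and smaller still at i.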


Corollary 5.17. The function J(τ) takes on any value in D exactly once; that is, the weighted number of zeros of J(τ) − a in D calculated in the sense of the left-hand side of (5.40) is equal to 1 for any a ∈ C.


Proof. By Proposition 5.16 the right-hand side of (5.40) for the case f = J is equal to 1. The claim therefore follows from Corollary 5.13.


We now show that J(τ) gives rise to all possible modular functions, as the next result explains.


Theorem 5.18. A meromorphic function f : H → C is a modular function if and only if it is of the form

\[ f(\tau) = R(J(\tau)) \]

for some rational function R(w).


Proof. The “if” part is obvious; for the “only if,” let f(τ) be a modular function that is not identically zero. Denote by μ_{i∞} the order of the zero of f(τ) at i∞. Denote by μ_i the order of the zero of f(τ) at i. Denote by μ_ρ the order of the zero of f(τ) at ρ := e^{2πi/3}. In these definitions, we use the usual convention that μ_a (for a = i, ρ, i∞) is negative and equal to minus the order of the pole at a if there is a pole at that point instead of a zero.

Denote the zeros of f in the fundamental domain, counted with multiplicities but excluding the points i, ρ, i∞, by z_1, ..., z_n. Denote the poles of f in the fundamental domain, with the same exclusions, by w_1, ..., w_k.

Relation (5.40) translates to the concrete statement that

\[ n + \mu_{i\infty} + \frac{1}{2}\, \mu_i + \frac{1}{3}\, \mu_\rho = k. \tag{5.47} \]

Since n, k, and μ_{i∞} are integers, we see that μ_i must be even, and μ_ρ must be a multiple of 3.

Now define the function

\[ g(\tau) = J(\tau)^{\mu_\rho/3}\, (J(\tau) - 1)^{\mu_i/2}\, \frac{\prod_{j=1}^{n} \big( J(\tau) - J(z_j) \big)}{\prod_{j=1}^{k} \big( J(\tau) - J(w_j) \big)}. \tag{5.48} \]

Let h(τ) = f(τ)/g(τ). This is a modular function; let us examine where it has zeros and poles. By Corollary 5.17 each of the factors J(τ) − J(a) participating in the product in (5.48) (where a = z_j or a = w_j for some j) has a simple zero at τ = a and no other zeros. Therefore the zeros of f at z_1, ..., z_n and the poles of f(τ) at w_1, ..., w_k are precisely canceled out by the factors J(τ) − J(z_j) and (J(τ) − J(w_j))^{−1} in g, so the points z_1, ..., z_n, w_1, ..., w_k are not zeros or poles of h. No other zeros or poles at any other points of the fundamental domain that are not the special points i∞, i, ρ are contributed by any multiplicand. Thus h may have zeros or poles at the three special points but nowhere else.

In fact, there are no zeros or poles at the special points either, since by Proposition 5.16 the factor (J(τ) − 1)^{μ_i/2} has a zero of order μ_i at i, which cancels out the zero of order μ_i of f(τ) at i; similarly, the factor J(τ)^{μ_ρ/3} has a zero of order μ_ρ at ρ, canceling out the zero of the same order of f(τ) at ρ; and, finally, the order of the zero of h(τ) at i∞ is

\[ \mu_{i\infty} + \frac{\mu_\rho}{3} + \frac{\mu_i}{2} + n - k = 0 \]

by (5.47).

The conclusion is that h(τ) is a modular function with no poles or zeros and is therefore a constant by Corollary 5.14, that is, h = c with c ∈ C. We have therefore shown that f(τ) = c g(τ), which is a rational function in J(τ), as claimed.


5.10 The J-invariant as a conformal map 


Another thing that makes J(τ) a natural function is that it is a conformal map and elucidates the structure of the modular surface H/Γ as a Riemann surface.


Theorem 5.19. The function J(τ) is a biholomorphism between the modular surface H/Γ and the Riemann sphere Ĉ.


Sketch of proof. J(τ) maps H to C but respects the equivalence relation induced by the action of the modular group Γ. Thus it induces a function (which, abusing notation slightly, we also denote by J) J : H/Γ → C. Adding the point i∞, which gets mapped by J to the point ∞ on the Riemann sphere (this is just the fancy Riemann surface way of saying J(τ) has a simple pole at i∞, as we stated in Proposition 5.16), turns J into a function from the full modular surface to the Riemann sphere. This function is holomorphic: this is reasonably obvious at a generic point of H/Γ but requires an explanation in terms of the Riemann surface structure of H/Γ at the special points τ = i, e^{2πi/3}, i∞. To avoid an involved digression into Riemann surfaces, we omit the details.

Moreover, we claim that the induced function is in fact a bijection and therefore a biholomorphism of Riemann surfaces. Indeed, Corollary 5.17 states that J(τ) takes on any value a exactly once on D (or, equivalently, on H/Γ) in the sense of the weighted sum (5.40) over solutions of f(ζ) = a, with f = J. For a = 0, this corresponds to the triple zero at τ = e^{2πi/3} (which is the only zero; otherwise, the weighted sum would be greater than 1); for a = 1, this corresponds to the double zero of J(τ) − 1 at τ = i, which again must be the only solution to the equation J(τ) = 1; for a = ∞, this corresponds to the simple pole at τ = i∞. For any other a ∈ C, the equation J(τ) = a must have at least one solution τ ∈ D, and since τ is not one of the special points i, e^{2πi/3}, i∞, it has weight w(τ) = 1, and therefore (5.40) guarantees that it is the only solution. Thus J is a bijection.


5.11 The classification problem for complex tori, part III 


We saw in Section 5.5 that the fundamental domain D is a natural index set for the fam- 
ily of biholomorphism classes of complex tori C/L. While this is satisfying at one level, 
it still leaves some room to complain that the fundamental domain is an oddly shaped 
region, with various identifications along its boundary induced by the action of T mak- 
ing its structure odder and still more mysterious. However, the result of the previous 
section clarifies things by showing that this structure is in fact simply that of the set of 
complex numbers, with the J-invariant acting as a conformal map translating between 
the two sets. Thus we arrive at the following result, which complements the results of 
Sections 4.15 and 5.5 and completes our solution of the classification problem for com- 
plex tori. 


Theorem 5.20 (Classification of complex tori; third part). The conformal map J^{-1}
parameterizes the biholomorphism classes of complex tori ℂ/L in the following precise sense: for
any z ∈ ℂ, denote by τ(z) the point in the fundamental domain D for which J(τ(z)) = z.
(Theorem 5.19 guarantees that τ(z) exists and is unique.) Then the map z ↦ ℂ/L_{τ(z)} is a
bijection between ℂ and the biholomorphism classes of complex tori.


Proof. Immediate from Theorem 5.5. 


Recall also that Theorem 4.25 established a bijection between the family of complex 
tori ℂ/L and the family of elliptic curves E(g_2, g_3). Thus Theorems 4.24, 5.5, and 5.20,
which together formed our solution to the classification problem for complex tori, when 
combined with Theorem 4.25, also give a complete solution to the analogous classifica- 
tion problem for elliptic curves. 


5.12 Modular forms 


As Theorem 5.18 makes evident, the property of being a modular function is such a
strong one that we end up with a fairly small collection of functions, the rational
functions in J(τ), which, moreover, does not include most of the interesting functions we
already encountered and which served as motivation for much of the theory we
developed so far in this chapter, such as the Eisenstein series and the modular
discriminant.

Fortunately, the true richness and beauty of the theory starts to emerge once we
expand our notion of modularity from modular functions to the more general concept
of modular forms. For an integer ℓ ≥ 0, we say that a function f : ℍ → ℂ is an entire
modular form of weight ℓ if it is a premodular form, is holomorphic at i∞ (that is, the
Fourier expansion (5.20) contains no terms with n < 0), and satisfies the condition


\[
f\left(\frac{a\tau+b}{c\tau+d}\right) = (c\tau+d)^{\ell} f(\tau)
\quad \text{for all } \tau \in \mathbb{H},\ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma. \tag{5.49}
\]


We say that f : ℍ → ℂ is a weak modular form of weight ℓ if it is a weak premodular
form and satisfies (5.49). Note that the notion of modular functions coincides with that
of weak modular forms of weight ℓ = 0.⁵

In practice, to check that a function is a modular form, it is sufficient and necessary 
to check that it is periodic and transforms in a certain way under the map τ ↦ −1/τ, as
the next lemma explains. 


Lemma 5.21. A function f : ℍ → ℂ satisfies (5.49) if and only if it satisfies the functional
equations


\[ f(\tau+1) = f(\tau), \qquad f(-1/\tau) = \tau^{\ell} f(\tau). \tag{5.50} \]


Proof. Exercise 5.6. 


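Another simple observation is that if f is a nonzero (weak or entire) modular form
of weight ℓ, the weight must be an even integer. This is necessary for the condition (5.49)
to be self-consistent, since we can apply this relation with the group element
$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ of Γ being equal to either
$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ or $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$
(both representing the same Möbius transformation τ ↦ −1/τ), to get that


\[
\tau^{\ell} f(\tau) = f\left(\frac{-1}{\tau}\right) = f\left(\frac{1}{-\tau}\right) = (-\tau)^{\ell} f(\tau) = (-1)^{\ell}\,\tau^{\ell} f(\tau),
\]

implying that either f is identically zero or ℓ is even.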

The following result is an analogue of Theorem 5.12 for modular forms and is of 
fundamental importance. 


Theorem 5.22 (The weight formula for modular forms). Let f : ℍ → ℂ be an entire
modular form of weight ℓ that is not the zero function. Then


\[ 12 \sum_{f(\tau)=0} w(\tau) = \ell. \tag{5.51} \]


Here the summation extends over all zeros of f(τ) counted with multiplicities, including
the point i∞ if it is a zero.


5 The logic behind not attaching the label “weak” to modular functions is that, as Corollary 5.14 shows, 
there is no useful notion of a “strong” or “entire” modular function. Nonetheless, this terminology is a 
bit inconsistent and a possible source of confusion to be aware of. 
Proof. The proof involves a repetition of the calculation used in the proof of
Theorem 5.12, where we consider the same contour integral (5.41) as we did in that proof.
In the current setting, the modular transformation property (5.49) that generalizes the
simple notion of modular invariance associated with a modular function will affect the
calculation in a specific way, which needs to be carefully examined. We will not go over
the full calculation again, but simply point out where the change happens, which is in
the consideration of the contour integrals of f′(τ)/f(τ) over the subcontours γ_4 and γ_5 of
the overall integration contour G (refer to the proof of Theorem 5.12 for the definitions).
Where previously we saw in (5.42) that the two integrals canceled each other out,
now there will be a residual effect from the factor τ^ℓ appearing in the transformation
property (5.50). Specifically, the version of (5.42) updated for the current situation is


\begin{align*}
\int_{\gamma_5} \frac{f'(\tau)}{f(\tau)}\,d\tau
&= \int_{-\gamma_4} \frac{f'(-1/\rho)}{f(-1/\rho)}\,\frac{d\rho}{\rho^{2}}
 = -\int_{\gamma_4} \left( \frac{f'(\rho)}{f(\rho)} + \frac{\ell}{\rho} \right) d\rho \\
&= -\int_{\gamma_4} \frac{f'(\rho)}{f(\rho)}\,d\rho - \ell \int_{\gamma_4} \frac{d\rho}{\rho}
 = -\int_{\gamma_4} \frac{f'(\rho)}{f(\rho)}\,d\rho + \frac{\pi i}{6}\,\ell.
\end{align*}


We leave to the reader to check that when the reasoning of the proof of Theorem 5.12
is carried out again but with the new term πiℓ/6 included, the result is precisely (5.51).
(Note that another difference from the case of modular functions is that in the current
setting, poles are not allowed, which means that when repeating the calculation from the
proof of Theorem 5.12, all the terms associated with counting poles can be set to 0.)


Theorem 5.22 gives us a powerful tool for understanding what sort of functions can
be entire modular forms of different weights. We now aim to use it to classify the
modular forms of even weight ℓ = 2k for any k ≥ 0. We start by answering this question for
small values of k.


Proposition 5.23. Let f(τ) be an entire modular form of even weight ℓ = 2k ≤ 10. Then:
(a) If ℓ = 0, then f is a constant.

(b) If ℓ = 2, then f is the zero function.

(c) If ℓ ∈ {4, 6, 8, 10}, then f is a constant multiple of G_{2k}.


Proof. The case ℓ = 0 is the case of modular functions without poles. In this case, we
already saw in Corollary 5.14 that the only functions with these properties are the constant
functions.

For the case ℓ = 2, note that by the definition of the weight function w(τ), the weight
formula (5.51) cannot be satisfied with any possible (multi)set of zeros, as the smallest
positive contribution on the left-hand side can be 4, so f must be the zero function.

Similarly, for other values ℓ ∈ {4, 6, 8, 10}, formula (5.51) can be satisfied but only
in very limited ways. Specifically, it is impossible to have any zeros at points other than
τ = i, e^{2πi/3}, since for such zeros, we have 12w(τ) = 12, which is too large. So we need
to consider for each value of ℓ different solutions in nonnegative integers a, b of the
equation


\[ \ell = 4a + 6b. \]

Here a and b denote the orders of the zero of f(τ) at τ = e^{2πi/3} and τ = i, respectively.

In the case ℓ = 4 the only solution is a = 1, b = 0, that is, f(τ) must have a simple
zero at τ = e^{2πi/3} and no other zeros. By the same reasoning applied to the Eisenstein
series G_4 instead of to f, G_4 as well must have a simple zero at τ = e^{2πi/3} and no other
zeros. Therefore f(τ)/G_4(τ) is an entire modular form of weight 0 and hence a constant
by part (a).

In the case ℓ = 6, we get that a = 0 and b = 1, so f(τ) must have a simple zero at
τ = i and no other zeros. Again, the same conclusion must also apply to G_6, so f(τ)/G_6(τ)
is an entire modular form of weight 0 and hence a constant.

In the case ℓ = 8 the unique solution is a = 2, b = 0, so f(τ) must have a zero of
order 2 at τ = e^{2πi/3} and no other zeros. Therefore f(τ)/G_4(τ)^2 is an entire modular form
of weight 0 and hence a constant. Since G_4^2 is proportional to G_8 (see (4.19)), the claim
follows in this case.

Finally, in the case ℓ = 10, we get that a = b = 1, so f(τ) has a simple zero at τ = i,
a simple zero at τ = e^{2πi/3}, and no other zeros. Therefore f(τ)/(G_4(τ)G_6(τ)) is an entire
modular form of weight 0 and hence a constant. Since we know from (4.20) that G_4G_6 is
proportional to G_10, this case is also proved.


The next result characterizes all entire modular forms of an arbitrary even weight.
This is best stated in terms of linear algebra. For k ≥ 0, we define the vector space M_{2k}
(over the field of complex numbers, naturally) as the space consisting of all entire
modular forms of weight 2k.


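Theorem 5.24. (a) The vector spaces M_{2k} are finite-dimensional. Their dimensions are
equal to


\[
\dim M_{2k} =
\begin{cases}
\dfrac{2k-2}{12} & \text{if } 2k \equiv 2 \pmod{12}, \\[1ex]
\left\lfloor \dfrac{k}{6} \right\rfloor + 1 & \text{otherwise.}
\end{cases}
\]


(b) A linear basis for M_{2k} is the set

\[ \mathcal{A}_k = \left\{ G_4(\tau)^a\, G_6(\tau)^b \,:\, a, b \in \mathbb{Z},\ a, b \ge 0,\ 4a + 6b = 2k \right\}. \tag{5.52} \]

(c) Another linear basis for M_{2k} is the set

\[ \mathcal{B}_k = \left\{ G_{2k-12a}(\tau)\,\Delta(\tau)^a \,:\, a \in \mathbb{Z},\ 0 \le a \le \left\lfloor \tfrac{k}{6} \right\rfloor,\ 12a \ne 2k-2 \right\} \tag{5.53} \]


with the notational convention that G_0 = 1.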
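As a quick sanity check on the bookkeeping in Theorem 5.24 (not a substitute for the proof or for Exercise 5.7), the following sketch compares the dimension formula of part (a) with the number of exponent pairs in (5.52) and the number of admissible exponents in (5.53). The function names are ours and purely illustrative.

```python
def dim_formula(k: int) -> int:
    """Dimension of M_{2k} as stated in Theorem 5.24(a)."""
    if (2 * k) % 12 == 2:
        return (2 * k - 2) // 12
    return k // 6 + 1


def basis_A(k: int) -> list[tuple[int, int]]:
    """Exponent pairs (a, b) with a, b >= 0 and 4a + 6b = 2k, as in (5.52)."""
    return [(a, (2 * k - 4 * a) // 6)
            for a in range(k // 2 + 1)
            if (2 * k - 4 * a) % 6 == 0]


def basis_B(k: int) -> list[int]:
    """Exponents a with 0 <= a <= floor(k/6) and 12a != 2k - 2, as in (5.53)."""
    return [a for a in range(k // 6 + 1) if 12 * a != 2 * k - 2]


# Print weight 2k, the claimed dimension, and the monomials G_4^a G_6^b.
for k in range(0, 13):
    print(2 * k, dim_formula(k), basis_A(k), basis_B(k))
```

For instance, for weight 12 the loop lists the two pairs (0, 2) and (3, 0), corresponding to the monomials G_6^2 and G_4^3.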
Proof. We prove part (c) (which also trivially implies part (a)) by induction on k. The
base cases 2k = 0, 2, 4, 6, 8, 10 form precisely the content of Proposition 5.23. For the
inductive step, let 2k ≥ 12. We claim that M_{2k} is spanned by the set B_k. To show this, let
f ∈ M_{2k}. Let α = f(i∞) be the constant coefficient in the Fourier expansion for f. Then
g(τ) = f(τ) − (α/(2ζ(2k))) G_{2k}(τ) is an entire modular form of weight 2k and satisfies g(i∞) = 0.
Therefore the function g(τ)/Δ(τ) is an entire modular form of weight 2k − 12, that is,
an element of the space M_{2k−12}. By the inductive hypothesis it can be represented as a
linear combination of the form


\[ \frac{g(\tau)}{\Delta(\tau)} = \sum_{a} c_a\, G_{2k-12-12a}(\tau)\, \Delta(\tau)^{a} \]


for some coefficients c_a, where the sum ranges over all a ≥ 0 for which 2k − 12 − 12a ≥ 0
and 2k − 12 − 12a ≠ 2. In terms of the original modular form f, this means that we have
represented f in the form


\[ f(\tau) = \frac{\alpha}{2\zeta(2k)}\, G_{2k}(\tau) + \sum_{a} c_a\, G_{2k-12(a+1)}(\tau)\, \Delta(\tau)^{a+1}, \]

which is a linear combination of elements of B_k. This proves that B_k spans M_{2k}.
To establish linear independence, assume that we have a linear relation of the form


\[ \sum_{a} c_a\, G_{2k-12a}(\tau)\, \Delta(\tau)^{a} = 0 \]


over the appropriate range of indices a. In particular, for τ = i∞, this implies that c_0 = 0,
since G_{2k}(i∞) = 2ζ(2k) ≠ 0 (recall (5.23)). The remaining expression can be factored as


\[ \Delta(\tau) \sum_{a \ge 1} c_a\, G_{2k-12a}(\tau)\, \Delta(\tau)^{a-1} = 0, \]


that is,


\[ \sum_{a \ge 1} c_a\, G_{2k-12-12(a-1)}(\tau)\, \Delta(\tau)^{a-1} = 0, \]


so by the inductive hypothesis, c_a = 0 for all a. The proof by induction is complete.

Finally, to prove part (b), since we already showed that B_k is a linear basis, it is
sufficient to show that any element of B_k can be represented as a linear combination
of elements of A_k and that A_k and B_k have the same cardinality. The second claim is
left as an exercise (Exercise 5.7). For the first claim, use (4.22) and an induction to show
that for any j ≥ 2, G_{2j} can be represented as a linear combination of terms of the form
G_4^p G_6^q, where p, q ≥ 0 are integers satisfying 4p + 6q = 2j (this is a slightly more precise
version of Corollary 4.15). Then, taking 2j = 2k − 12a and using the fact that Δ is a linear
combination of G_4^3 and G_6^2, we see that G_{2k−12a}Δ^a can similarly be expressed as a linear
combination of monomials G_4^p G_6^q with 4p + 6q = 2k, as claimed.
Corollary 5.25. Any entire modular form can be expressed as a polynomial in G_4 and G_6.


Proof of (5.36) and (5.37). We now revisit our earlier discussion about the Eisenstein
series identities (5.36)–(5.37) and the number-theoretic identities (5.34)–(5.35) they imply.
The main thing to observe is that Theorem 5.24 reduces these identities and similar ones
to essentially a triviality, since each of them represents an equality between elements of a
finite-dimensional (indeed, very low-dimensional in the situation at hand) vector space, whose
existence can be guessed based on simple linear-algebraic considerations, and whose
precise form can be derived mechanically.

The verification in the case of (5.36) is as follows: since the space M_14 of modular
forms of weight 14 is of dimension 1 and contains both G_14 and G_4G_10, there must be
a linear dependence between these two modular forms, that is, a relation of the form
G_14 = cG_4G_10 for some constant c. The value of the constant c can now be found simply
by comparing the zeroth Fourier coefficient of the two sides of the relation. (You can
check that this leads to c = 6/13.)

The verification of (5.37) is similar but involves the two-dimensional space M_12,
which contains the modular forms Δ, G_12, and G_6^2 as elements. Again, because of our
knowledge of the dimension of the space, we can deduce the existence of a linear
dependence relation of the form Δ = aG_12 + bG_6^2 for some unknown constants a, b.
Looking at the first two Fourier coefficients gives two linear equations for the coefficients
a, b, which (again, you are encouraged to check) are easily solved to give the values
a = 1200 × 1430 and b = −1200 × 691.
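The "mechanical" derivation described above can be carried out in exact rational arithmetic. The sketch below is only an illustration and rests on three assumptions that are not restated in this section: that the Fourier expansion (5.23) has the standard form G_w = 2ζ(w)(1 − (2w/B_w) Σ σ_{w−1}(n) q^n) for even w ≥ 4, Euler's formula ζ(w) = |B_w|(2π)^w / (2·w!), and the standard normalization g_2 = 60G_4, g_3 = 140G_6 in Δ = g_2^3 − 27g_3^2. All series are stored in units of π^w so that only rational numbers appear.

```python
from fractions import Fraction
from math import comb, factorial


def bernoulli(n: int) -> Fraction:
    """Bernoulli number B_n (with B_1 = -1/2), via the standard recurrence."""
    B = [Fraction(1)]
    for m in range(1, n + 1):
        B.append(-sum(comb(m + 1, j) * B[j] for j in range(m)) / (m + 1))
    return B[n]


def sigma(n: int, p: int) -> int:
    """Divisor power sum sigma_p(n)."""
    return sum(d ** p for d in range(1, n + 1) if n % d == 0)


def eisenstein(w: int, terms: int) -> list[Fraction]:
    """Fourier coefficients of G_w / pi^w (ASSUMED form of (5.23), see above)."""
    zeta_over_pi_power = abs(bernoulli(w)) * 2 ** (w - 1) / factorial(w)
    c0 = 2 * zeta_over_pi_power
    factor = -2 * w / bernoulli(w)
    return [c0] + [c0 * factor * sigma(n, w - 1) for n in range(1, terms)]


def mul(a, b):
    """Product of two power series in q, truncated to len(a) terms."""
    return [sum(a[i] * b[m - i] for i in range(m + 1)) for m in range(len(a))]


TERMS = 4
G4, G6, G10, G12, G14 = (eisenstein(w, TERMS) for w in (4, 6, 10, 12, 14))

# (5.36): G_14 = c * G_4 * G_10, with c read off from the constant terms.
G4G10 = mul(G4, G10)
c = G14[0] / G4G10[0]

# (5.37): Delta = a * G_12 + b * G_6^2, solved from the q^0 and q^1 coefficients.
G6_sq = mul(G6, G6)
G4_cubed = mul(mul(G4, G4), G4)
Delta = [60 ** 3 * x - 27 * 140 ** 2 * y for x, y in zip(G4_cubed, G6_sq)]
det = G12[0] * G6_sq[1] - G12[1] * G6_sq[0]
a = (Delta[0] * G6_sq[1] - Delta[1] * G6_sq[0]) / det
b = (G12[0] * Delta[1] - G12[1] * Delta[0]) / det

print(c, a, b)
```

Under these assumptions the printed constants should agree with the values c = 6/13, a = 1200 × 1430, and b = −1200 × 691 quoted above; the remaining stored coefficients can be used to confirm that the two identities also hold beyond the terms used to solve for the constants.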


Suggested exercises for Section 5.12. 5.6, 5.7, 5.8. 


5.13 Examples of modular forms 


We have already encountered some of the most important examples of modular forms,
namely:

1. The Eisenstein series G_{2k}, k ≥ 2, is a modular form of weight 2k.

2. The modular discriminant Δ = g_2^3 − 27g_3^2 is a modular form of weight 12.

3. Klein’s J-function J = g_2^3/Δ is a modular function and a weak modular form of
weight 0.


Although Corollary 5.25 guarantees that all modular forms can in fact be represented in
terms of these known, “obvious” examples, other examples of modular forms sometimes
appear “in the wild,” arising out of formulas that do not make it at all obvious that these
functions are either modular forms or related to Eisenstein series. Below we survey a
few important examples.


6 Moreover, many more examples come up in more advanced parts of the theory when we broaden 
the notion of what a modular form is to allow for functions that have nice transformation properties 
5.13.1 Theta functions


In our study of the Riemann zeta function in Chapter 2, we encountered the function


\[ \theta(t) = \sum_{n=-\infty}^{\infty} e^{-\pi n^{2} t} \]


(see (2.15)) whose functional equation θ(1/t) = √t θ(t) (Theorem 2.7) provides one of
the standard ways of analytically continuing ζ(s) to a meromorphic function on ℂ and
proving its functional equation. This function is in fact a mildly disguised modular form
(although of weight 1/2, and under the action of a subgroup of Γ rather than the full
modular group) and belongs to a much larger family of functions known as theta
functions. Switching to the notation more customary in the theory of modular forms,
we define functions


\begin{align}
\theta_2(\tau) &= \sum_{n=-\infty}^{\infty} e^{\pi i (n+1/2)^{2} \tau}, \tag{5.54} \\
\theta_3(\tau) &= \sum_{n=-\infty}^{\infty} e^{\pi i n^{2} \tau}, \tag{5.55} \\
\theta_4(\tau) &= \sum_{n=-\infty}^{\infty} (-1)^{n} e^{\pi i n^{2} \tau}. \tag{5.56}
\end{align}


We will refer to them as the Jacobi thetanull functions.⁷


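Theorem 5.26. The functions θ_j(τ) satisfy the following transformation properties under
the generators T, S of the modular group Γ:


\begin{align}
\theta_2(\tau+1) &= e^{\pi i/4}\,\theta_2(\tau), & \theta_2(-1/\tau) &= \sqrt{-i\tau}\;\theta_4(\tau), \tag{5.57} \\
\theta_3(\tau+1) &= \theta_4(\tau), & \theta_3(-1/\tau) &= \sqrt{-i\tau}\;\theta_3(\tau), \tag{5.58} \\
\theta_4(\tau+1) &= \theta_3(\tau), & \theta_4(-1/\tau) &= \sqrt{-i\tau}\;\theta_2(\tau). \tag{5.59}
\end{align}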


Proof. This is Exercise 5.9. Note that the relations involving θ_j(τ + 1) are immediate from
the definitions; the relation between θ_3(−1/τ) and θ_3(τ) is the same as the transformation
property θ(1/x) = √x θ(x) discussed above from the theory of the Riemann zeta function;
and the remaining relations involving θ_j(−1/τ) for j = 2, 4 are proved using an argument
involving the Poisson summation formula similarly to that used to prove the functional
equation for θ_3(−1/τ).


with respect to only a subgroup of the full modular group or otherwise relax or generalize the various
conditions a modular form is expected to satisfy. Here we focus mostly on the forms that are modular
under the full action of Γ.

7 The functions θ_j(τ) are also sometimes referred to as Jacobi theta constants or Jacobi theta
functions. The term “Jacobi theta function” also denotes a more general function of two complex variables z
and q, which specializes to our θ_j under certain substitutions of the “elliptic” variable z.


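Theorem 5.27. We have the following identities:


\begin{align}
G_4 &= \frac{\pi^{4}}{90}\left(\theta_2^{8} + \theta_3^{8} + \theta_4^{8}\right), \tag{5.60} \\
G_6 &= \frac{\pi^{6}}{945}\left(\theta_3^{12} + \theta_4^{12} - 3\theta_2^{8}\left(\theta_3^{4} + \theta_4^{4}\right)\right), \tag{5.61} \\
J &= \frac{\left(\theta_2^{8} + \theta_3^{8} + \theta_4^{8}\right)^{3}}{54\,(\theta_2\theta_3\theta_4)^{8}}, \tag{5.62} \\
\Delta &= 16\pi^{12}\,(\theta_2\theta_3\theta_4)^{8}. \tag{5.63}
\end{align}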


Proof. Exercise 5.10. 
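Before attempting Exercise 5.10, it can be reassuring to test the identities of Theorem 5.27 numerically. The sketch below is only a floating-point spot check at a single point τ, not a proof. It compares (5.60) and (5.61) with the q-expansions G_4 = (π^4/45)(1 + 240 Σ σ_3(n)q^n) and G_6 = (2π^6/945)(1 − 504 Σ σ_5(n)q^n), which we assume to be the form taken by (5.23) in these weights, and compares (5.63) with the infinite product (5.65) proved in Section 5.14. The truncation levels and the test point are arbitrary choices.

```python
import cmath
import math


def thetas(tau: complex, N: int = 12):
    """Truncations of the series (5.54)-(5.56) for theta_2, theta_3, theta_4."""
    idx = range(-N, N + 1)
    t2 = sum(cmath.exp(1j * math.pi * (n + 0.5) ** 2 * tau) for n in idx)
    t3 = sum(cmath.exp(1j * math.pi * n * n * tau) for n in idx)
    t4 = sum((-1) ** abs(n) * cmath.exp(1j * math.pi * n * n * tau) for n in idx)
    return t2, t3, t4


def sigma(n: int, p: int) -> int:
    """Divisor power sum sigma_p(n)."""
    return sum(d ** p for d in range(1, n + 1) if n % d == 0)


def q_series(tau: complex, p: int, terms: int = 60) -> complex:
    """Truncation of sum_{n>=1} sigma_p(n) q^n with q = exp(2 pi i tau)."""
    q = cmath.exp(2j * math.pi * tau)
    total, qn = 0j, 1 + 0j
    for n in range(1, terms):
        qn *= q                      # qn == q**n
        total += sigma(n, p) * qn
    return total


def delta_product(tau: complex, terms: int = 60) -> complex:
    """Truncation of the right-hand side of (5.65)."""
    q = cmath.exp(2j * math.pi * tau)
    prod, qn = 1 + 0j, 1 + 0j
    for _ in range(1, terms):
        qn *= q
        prod *= (1 - qn) ** 24
    return (2 * math.pi) ** 12 * q * prod


def rel_err(x: complex, y: complex) -> float:
    return abs(x - y) / abs(y)


tau = complex(0.2, 1.0)              # arbitrary test point in the upper half-plane
t2, t3, t4 = thetas(tau)

G4_theta = math.pi ** 4 / 90 * (t2 ** 8 + t3 ** 8 + t4 ** 8)
G6_theta = math.pi ** 6 / 945 * (t3 ** 12 + t4 ** 12 - 3 * t2 ** 8 * (t3 ** 4 + t4 ** 4))
Delta_theta = 16 * math.pi ** 12 * (t2 * t3 * t4) ** 8

G4_series = math.pi ** 4 / 45 * (1 + 240 * q_series(tau, 3))
G6_series = 2 * math.pi ** 6 / 945 * (1 - 504 * q_series(tau, 5))

print(rel_err(G4_theta, G4_series))
print(rel_err(G6_theta, G6_series))
print(rel_err(Delta_theta, delta_product(tau)))
```

All three printed relative errors should be at the level of floating-point rounding; a discrepancy of order 1 would indicate a misread constant or exponent.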


5.13.2 The modular lambda function 


Define the function λ : ℍ → ℂ by


\[ \lambda(\tau) = \frac{e_3(\tau) - e_2(\tau)}{e_1(\tau) - e_2(\tau)}, \tag{5.64} \]


where e_1(τ), e_2(τ), e_3(τ) are the quantities derived from the Weierstrass ℘-function
associated with the lattice L = ℤ + τℤ according to (4.25). The function λ(τ) is known as
the modular lambda function. It is a modular form, although not quite of the ordinary
kind we are used to working with. The next result adds more details.


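Theorem 5.28. (a) λ(τ) is a modular function under the action of the congruence group
Γ(2) discussed in Exercise 5.4, that is, λ(τ) satisfies


\[ \lambda\left(\frac{a\tau+b}{c\tau+d}\right) = \lambda(\tau) \quad \text{for all } \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma(2). \]


(b) Klein’s J-invariant can be expressed in terms of λ(τ) as


\[ J(\tau) = \frac{4}{27}\,\frac{(1-\lambda+\lambda^{2})^{3}}{\lambda^{2}(1-\lambda)^{2}}. \]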


Proof. Exercise 5.11. 


The modular λ function has interesting applications to parts of complex analysis
that seem a priori unrelated to modular forms. The most well-known such application
is its use in giving a slick proof of a deep result known as Picard’s theorem.

Theorem 5.29 (Picard’s theorem). Let f : ℂ → ℂ be an entire function such that two
distinct complex numbers a, b are not in the image of f. Then f is a constant.


The proof, although conceptually simple, involves a use of the monodromy theorem, 
which is outside the scope of this book. See [1, Ch. 8] for the details. 

Another appearance of the modular lambda function is in connection with a maxi- 
mization problem in the theory of conformal mapping of doubly connected regions. This 
is discussed in [2, Sec. 4.12]. 


5.13.3 The zeros of ℘(z) and their modular properties


Fix a lattice L = ω_1ℤ + ω_2ℤ. In our discussion of doubly periodic functions in Chapter 4,
we saw that both ℘(z) and its derivative ℘′(z) have their poles at the points of L and that
℘′(z) has its zeros at the half-periods ω_1/2, ω_2/2, (ω_1 + ω_2)/2. We also discussed that ℘(z)
takes every value twice in any fundamental parallelogram as a doubly periodic function
of order 2. It might therefore seem like a curious omission that we never discussed the
question of where the zeros of ℘(z) are located. In fact, the question of the location of
the zeros as a function of the lattice L turns out to be quite nontrivial and gives rise to
an interesting modular form.

Let us denote the location of one of the zeros of ℘(z) by Z. This is a function of the
lattice L, so we can write Z = Z(L) or


\[ Z = Z(\tau) \]


if we switch to the notation involving the modular variable τ taking values in the upper
half-plane and representing the “canonical” lattice L_τ = ℤ + τℤ, that is, the defining
equation of Z(τ) is


\[ \wp(Z(\tau); \tau) = 0 \qquad (\tau \in \mathbb{H}). \]


It is natural to think of Z(τ) as a multivalued function of τ in the sense that,
similarly to the logarithm and kth root functions we are familiar with from basic
complex analysis, it takes its values in the quotient of the complex plane by some
discrete group of symmetries. In our case the set of zeros of ℘(z) has two obvious
symmetries: it is L-periodic, and (since ℘(z) is even) it is invariant under the reflection z ↦ −z.
Thus Z(τ) can be thought of as a function of τ that is well-defined up to a translation
by an arbitrary element of L and a sign change. Moreover, the location of any one zero
of ℘(z) determines the location of all of its zeros, since if Z lies in some fundamental
parallelogram, then either Z is a half-period and then must be of order 2 (in which case
there are no other zeros in the parallelogram), or Z is not a half-period, is a simple
zero, and is matched by another zero at the unique point in the parallelogram that is
congruent to −Z modulo L. That is, geometrically, the zeros come in pairs of points that
are reflections of each other around the center of the parallelogram.

It is worth keeping in mind that when we discuss multivalued functions, we are 
really talking about functions taking values in a certain Riemann surface. We will not 
explore this point of view in depth, but if you find it interesting, then try to think what 
the Riemann surface is in this case. 

The question of understanding the behavior of Z(t) seems to have been addressed 
for the first time in a 1982 paper by Eichler and Zagier [26], who derived a formula for 
this function. A more explicit formula was found in 2008 by Duke and Imamoḡlu [24].
It seems possible that the last word has not yet been said on this interesting and quite 
nontrivial problem. 

We present below without proof Eichler and Zagier’s result, which ties in a nice way 
to our current discussion of modular forms. 


Theorem 5.30. (a) The function Z(τ) is holomorphic.

(b) The function Z″(τ)^2 is a single-valued function of τ, that is, an ordinary holomorphic
function on ℍ.

(c) The function Z″(τ)^2 is a weak modular form of weight 6 for the modular group Γ. It
is given explicitly by


\[ Z''(\tau)^{2} = -124416\,\pi^{2}\, \frac{\Delta(\tau)^{2}}{E_6(\tau)^{3}}. \]


(d) Z(τ) can be expressed explicitly as


\[ Z(\tau) = \pm\left( \frac{1}{2} + \frac{\log(5+2\sqrt{6})}{2\pi i} + 144\pi i \sqrt{6} \int_{\tau}^{i\infty} (\rho - \tau)\, \frac{\Delta(\rho)}{E_6(\rho)^{3/2}}\, d\rho \right). \]

5.13.4 Infinite products 


Modular forms often arise in applications in the form of certain types of infinite prod- 
ucts, where, again, the fact that the function expressed in such a way is a modular form 
is not easily apparent. This is the subject of the next section. 


Suggested exercises for Section 5.13. 5.9, 5.10, 5.11. 


5.14 Infinite products for modular forms 


One additional beautiful and somewhat mysterious aspect of the theory of modular
forms is the fact that many modular forms that are commonly encountered in the theory
have elegant representations as infinite products. It is not clear whether there is a good
conceptual explanation for why this happens so frequently [W20], or whether instead it
is yet another vivid illustration of John von Neumann’s famous quip that “in
mathematics you don’t understand things. You just get used to them.”⁸ Our goal in this section is
to prove a few of the most well-known identities of this type.


5.14.1 The modular discriminant 


The following result is one of the famous identities of modular form theory. 


Theorem 5.31. The modular discriminant Δ has the infinite product representation

\[ \Delta(\tau) = (2\pi)^{12}\, q \prod_{n=1}^{\infty} (1 - q^{n})^{24} \qquad (q = e^{2\pi i \tau},\ \tau \in \mathbb{H}). \tag{5.65} \]


One reason why identity (5.65) is interesting is that it highlights an unexpected con- 
nection between the modular discriminant and integer partitions, since the function on 
the right-hand side of (5.65) is, up to trivial factors, the generating function of integer 
partitions raised to the power —24. The connection between modular forms and integer 
partitions goes much further than this single identity and has far-reaching consequences 
that go quite deep into the theory; you can learn about it in more specialized books, such 
as [5]. 
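To get a feeling for the right-hand side of (5.65), one can expand the product q ∏ (1 − q^n)^24 as a power series in q by exact integer arithmetic. The sketch below does this by repeated multiplication by the factors (1 − q^n); the function name is ours. The resulting coefficients are the values of the Ramanujan tau function, which is standard terminology not otherwise used in this section.

```python
def delta_coefficients(n_terms: int) -> list[int]:
    """Coefficients of q * prod_{n>=1} (1 - q^n)^24 for the powers q^0, ..., q^n_terms."""
    # series[k] is the coefficient of q^k in prod (1 - q^n)^24, for k < n_terms.
    series = [1] + [0] * (n_terms - 1)
    for n in range(1, n_terms):
        for _ in range(24):
            # Multiply in place by (1 - q^n); going from high to low degree
            # ensures series[k - n] still holds its value from before this step.
            for k in range(n_terms - 1, n - 1, -1):
                series[k] -= series[k - n]
    # The extra factor q shifts every coefficient up by one degree.
    return [0] + series


print(delta_coefficients(6))
```

The expansion begins q − 24q^2 + 252q^3 − 1472q^4 + ⋯, so by (5.65) the Fourier expansion of Δ(τ) begins (2π)^12 (q − 24q^2 + ⋯).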

The existence of identity (5.65) is closely tied to yet another intriguing object, which
we will now study, the weight 2 Eisenstein series G_2. One motivation for introducing
G_2 is that Theorem 5.24 suggests an annoying gap in the dimensions of the vector spaces
M_{2k}(Γ). Noticing this, we might wonder whether the definition of a modular form of
weight 2 can be modified somehow to lead to some useful family of functions rather than
only the zero function and (which is related) whether formula (4.13) defining the Eisenstein
series can be made to make sense for the exponent 2 through some simple modification.
The answer to both these questions is “yes”; in fact, the modification to (4.13) is the most
obvious one that one can think of and consists of replacing an absolutely convergent
series by a conditionally convergent one. The next result explains what happens when
such a modification is carried out.


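Theorem 5.32. Define the weight 2 Eisenstein series G_2 by


\[ G_2(\tau) = \sum_{m=-\infty}^{\infty} \Bigg[ \sum_{\substack{n=-\infty \\ (m,n) \ne (0,0)}}^{\infty} \frac{1}{(m\tau+n)^{2}} \Bigg]. \tag{5.66} \]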


8 Von Neumann said this in response to a complaint from a colleague that he did not understand the 
method of characteristics [75, p. 208]. 
(a) Expression (5.66) defines a meromorphic function G_2(τ) on the upper half-plane.
(b) G_2 transforms under the actions of the generators T, S of Γ as


\begin{align}
G_2(\tau + 1) &= G_2(\tau), \tag{5.67} \\
G_2(-1/\tau) &= \tau^{2} G_2(\tau) - 2\pi i \tau. \tag{5.68}
\end{align}


(c) G_2 is a premodular form with the Fourier expansion


\[ G_2(\tau) = \frac{\pi^{2}}{3} \left( 1 - 24 \sum_{n=1}^{\infty} \sigma_1(n)\, q^{n} \right) \qquad (q = e^{2\pi i \tau},\ \tau \in \mathbb{H}). \tag{5.69} \]


Note that (5.69) is the case 2k = 2 of the Fourier expansion (5.23). Thus we see yet
another way in which G_2 can be thought of as extending the definition of the original
Eisenstein series G_{2k}, k ≥ 2, in the most natural way possible by a kind of “analytic
continuation” (very loosely speaking), that is, by taking one of the formulas that represent
those series and simply observing that it continues to represent a well-defined object
even in the case 2k = 2.


Proof. (a) For m = 0, the inner sum in (5.66) is equal to 2ζ(2) = π^2/3. For m ≠ 0, this
inner sum can be summed using (5.25) as


\[ \sum_{n=-\infty}^{\infty} \frac{1}{(m\tau+n)^{2}} = -\frac{1}{m} \frac{d}{d\tau} \big( \pi \cot(\pi m \tau) \big) = \frac{\pi^{2}}{\sin^{2}(\pi m \tau)}. \]


It is now easy to see that the infinite series Σ_{m≠0} sin^{−2}(πmτ) converges absolutely
uniformly on compacts in ℍ (since |sin(z)| grows exponentially in |Im(z)|). Thus G_2(τ) is
well-defined and holomorphic on ℍ.

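(c) The calculation is essentially a repetition of (5.27): again using (5.25) and also (5.26),
we have


\begin{align*}
G_2(\tau) &= \sum_{m \in \mathbb{Z}} \sum_{\substack{n \in \mathbb{Z} \\ (m,n) \ne (0,0)}} \frac{1}{(m\tau+n)^{2}}
 = 2\zeta(2) + \sum_{m \ne 0} \sum_{n=-\infty}^{\infty} \frac{1}{(m\tau+n)^{2}}
 = \frac{\pi^{2}}{3} + 2 \sum_{m=1}^{\infty} \sum_{n=-\infty}^{\infty} \frac{1}{(m\tau+n)^{2}} \\
&= \frac{\pi^{2}}{3} + 2(-2\pi i)^{2} \sum_{m=1}^{\infty} \sum_{\ell=1}^{\infty} \ell\, e^{2\pi i \ell m \tau}
 = \frac{\pi^{2}}{3} - 8\pi^{2} \sum_{n=1}^{\infty} \Bigg( \sum_{\ell \mid n} \ell \Bigg) e^{2\pi i n \tau} \\
&= \frac{\pi^{2}}{3} \left( 1 - 24 \sum_{n=1}^{\infty} \sigma_1(n)\, e^{2\pi i n \tau} \right),
\end{align*}


as claimed.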
(b) The first relation (5.67) is obvious from (5.69) and also easy to check directly
from the definition of G_2. Thus the main challenge is to prove (5.68). As in the proof of
(c) above, we start by attempting to replicate the calculation that we used to prove the
analogous property (5.3) (in the particular case of the transformation S(z) = −1/z) for the
“proper” Eisenstein series. However, in this case, we are in for a surprise. Specifically,
multiplying the left-hand side of (5.68) by τ^{−2} gives


\begin{align*}
\tau^{-2} G_2(-1/\tau)
&= \sum_{m \in \mathbb{Z}} \Bigg[ \sum_{\substack{n \in \mathbb{Z} \\ (m,n) \ne (0,0)}} \frac{1}{(n\tau - m)^{2}} \Bigg]
 = \sum_{n \in \mathbb{Z}} \Bigg[ \sum_{\substack{m \in \mathbb{Z} \\ (m,n) \ne (0,0)}} \frac{1}{(m\tau + n)^{2}} \Bigg] \\
&= \frac{\pi^{2}}{3} + \sum_{n \in \mathbb{Z}} \Bigg[ \sum_{m \ne 0} \frac{1}{(m\tau + n)^{2}} \Bigg].
\end{align*}


Comparing this to (5.66) and (5.68), we see that the proof of (5.68) reduces to showing the
following curious rearrangement identity:


\[ \sum_{m \ne 0} \Bigg[ \sum_{n} \frac{1}{(m\tau+n)^{2}} \Bigg] - \sum_{n} \Bigg[ \sum_{m \ne 0} \frac{1}{(m\tau+n)^{2}} \Bigg] = \frac{2\pi i}{\tau}. \tag{5.70} \]

In other words, what we have here is a naturally occurring example of a
conditionally convergent double summation for which changing the order of summation not
only changes the value of the series (which can happen, as we know from calculus),
but changes it in a predictable and rather interesting way.⁹ It is precisely this change
that accounts for G_2 satisfying the “exotic” transformation property (5.68) (sometimes
described as a “quasimodular” relation) rather than the more standard modular
transformation relation satisfied by the other, absolutely convergent G_{2k}.

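Denote the first double sum on the left-hand side of (5.70) by X and the second by Y.
We have


\begin{align*}
X &= \sum_{m \ne 0} \Bigg[ \sum_{n} \left( \frac{1}{(m\tau+n)^{2}} - \frac{1}{(m\tau+n)(m\tau+n+1)} \right) \Bigg]
   + \sum_{m \ne 0} \Bigg[ \sum_{n} \frac{1}{(m\tau+n)(m\tau+n+1)} \Bigg] \\
  &= \sum_{m \ne 0} \sum_{n} \frac{1}{(m\tau+n)^{2}(m\tau+n+1)}
   + \sum_{m \ne 0} \Bigg[ \sum_{n} \left( \frac{1}{m\tau+n} - \frac{1}{m\tau+n+1} \right) \Bigg].
\end{align*}


9 The examples illustrating this sort of order-dependence phenomenon in calculus textbooks often have
a rather contrived feel to them. This one seems more natural.

A key observation here is that the first of these two double summations is absolutely
convergent. Similarly,


\[
Y = \sum_{n} \sum_{m \ne 0} \frac{1}{(m\tau+n)^{2}(m\tau+n+1)}
   + \sum_{n} \Bigg[ \sum_{m \ne 0} \left( \frac{1}{m\tau+n} - \frac{1}{m\tau+n+1} \right) \Bigg],
\]


and thus, by absolute convergence, the difference X − Y is now seen to be equal to


\[
\sum_{m \ne 0} \Bigg[ \sum_{n} \left( \frac{1}{m\tau+n} - \frac{1}{m\tau+n+1} \right) \Bigg]
- \sum_{n} \Bigg[ \sum_{m \ne 0} \left( \frac{1}{m\tau+n} - \frac{1}{m\tau+n+1} \right) \Bigg]. \tag{5.71}
\]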

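The first of these new double series is trivial to evaluate, since the internal summation
is telescoping: we have


\begin{align*}
\sum_{m \ne 0} \Bigg[ \sum_{n} \left( \frac{1}{m\tau+n} - \frac{1}{m\tau+n+1} \right) \Bigg]
&= \sum_{m \ne 0} \Bigg[ \lim_{N \to \infty} \sum_{n=-N}^{N-1} \left( \frac{1}{m\tau+n} - \frac{1}{m\tau+n+1} \right) \Bigg] \\
&= \sum_{m \ne 0} \Bigg[ \lim_{N \to \infty} \left( \frac{1}{m\tau - N} - \frac{1}{m\tau + N} \right) \Bigg] = \sum_{m \ne 0} 0 = 0.
\end{align*}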

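The second double series in (5.71) is only slightly more challenging. Write


\begin{align*}
\sum_{n} \Bigg[ \sum_{m \ne 0} \left( \frac{1}{m\tau+n} - \frac{1}{m\tau+n+1} \right) \Bigg]
&= \lim_{N \to \infty} \sum_{n=-N}^{N-1} \sum_{m \ne 0} \left( \frac{1}{m\tau+n} - \frac{1}{m\tau+n+1} \right) \\
&= \lim_{N \to \infty} \sum_{m \ne 0} \sum_{n=-N}^{N-1} \left( \frac{1}{m\tau+n} - \frac{1}{m\tau+n+1} \right) \\
&= \lim_{N \to \infty} \sum_{m \ne 0} \left( \frac{1}{m\tau - N} - \frac{1}{m\tau + N} \right) \\
&= \lim_{N \to \infty} 2 \sum_{m=1}^{\infty} \left( \frac{1}{m\tau - N} - \frac{1}{m\tau + N} \right) \\
&= -\frac{2}{\tau} \lim_{N \to \infty} \sum_{m=1}^{\infty} \left( \frac{1}{N/\tau + m} + \frac{1}{N/\tau - m} \right). \tag{5.72}
\end{align*}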
Appealing to the partial fraction expansion of the cotangent function in one of its variant
forms,


\[ \pi \cot(\pi z) = \frac{1}{z} + \sum_{m=1}^{\infty} \left( \frac{1}{z+m} + \frac{1}{z-m} \right) \]


(a trivial recasting of (5.24)), we see that the last expression in (5.72) is equal to


\[ -\frac{2}{\tau} \lim_{N \to \infty} \left( \pi \cot\left( \frac{\pi N}{\tau} \right) - \frac{\tau}{N} \right) = -\frac{2\pi}{\tau} \lim_{N \to \infty} \cot\left( \frac{\pi N}{\tau} \right). \]


By a straightforward calculation (Exercise 5.12) this is equal to −2πi/τ, and
therefore (5.71) is equal to 2πi/τ, which is what we have reduced our claim to. The proof
is complete.
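Since the quasimodular relation (5.68) is easy to get wrong by a sign or a factor, a numerical spot check is worthwhile. The following sketch evaluates G_2 through a truncation of its Fourier expansion (5.69) and compares the two sides of (5.68) at an arbitrarily chosen point τ; the truncation level and the test point are our choices. It also checks the special value G_2(i) = π, which follows from (5.68) by setting τ = i.

```python
import cmath
import math


def sigma1(n: int) -> int:
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def G2(tau: complex, terms: int = 60) -> complex:
    """Truncation of the Fourier expansion (5.69) of G_2."""
    q = cmath.exp(2j * math.pi * tau)
    total, qn = 0j, 1 + 0j
    for n in range(1, terms + 1):
        qn *= q                      # qn == q**n
        total += sigma1(n) * qn
    return math.pi ** 2 / 3 * (1 - 24 * total)


tau = complex(0.3, 1.1)              # arbitrary test point in the upper half-plane
lhs = G2(-1 / tau)
rhs = tau ** 2 * G2(tau) - 2j * math.pi * tau

print(abs(lhs - rhs))                # should be at rounding-error level
print(abs(G2(1j) - math.pi))         # consequence of (5.68) at tau = i
```

Dropping the correction term −2πiτ from `rhs` makes the first printed number of order 1, which illustrates that G_2 fails to be a modular form of weight 2 precisely by this term.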


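Proof of Theorem 5.31. Define Δ̃ : ℍ → ℂ by

\[ \tilde{\Delta}(\tau) = (2\pi)^{12}\, q \prod_{n=1}^{\infty} (1 - q^{n})^{24}. \]


By Proposition 1.60 the product converges uniformly on compacts in ℍ and defines
a holomorphic function with no zeros, which, since it can be expanded as a series in
powers of q with good convergence properties, is a premodular form. Our goal is to
prove that Δ̃(τ) = Δ(τ), and this will pass through a curious relationship to the
Eisenstein series G_2. Namely, the logarithmic derivative of Δ̃(τ) is given by


\begin{align}
\frac{\tilde{\Delta}'(\tau)}{\tilde{\Delta}(\tau)}
&= 2\pi i - 24 \sum_{n=1}^{\infty} \frac{2\pi i\, n\, e^{2\pi i n \tau}}{1 - e^{2\pi i n \tau}}
 = 2\pi i \left( 1 - 24 \sum_{n=1}^{\infty} n \sum_{j=1}^{\infty} e^{2\pi i n j \tau} \right) \notag \\
&= 2\pi i \left( 1 - 24 \sum_{m=1}^{\infty} \Bigg( \sum_{n \mid m} n \Bigg) e^{2\pi i m \tau} \right) \notag \\
&= 2\pi i \left( 1 - 24 \sum_{m=1}^{\infty} \sigma_1(m)\, e^{2\pi i m \tau} \right) = \frac{6i}{\pi}\, G_2(\tau). \tag{5.73}
\end{align}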
We claim that this connection implies that Δ̃(τ) is a modular form of weight 12. By
Lemma 5.21 it suffices to prove that Δ̃ satisfies


\[ \tilde{\Delta}(-1/\tau) = \tau^{12}\, \tilde{\Delta}(\tau) \qquad (\tau \in \mathbb{H}). \tag{5.74} \]

However, the logarithmic derivative of the left-hand side is equal to


\begin{align*}
\frac{\frac{d}{d\tau}\big( \tilde{\Delta}(-1/\tau) \big)}{\tilde{\Delta}(-1/\tau)}
&= \frac{1}{\tau^{2}}\, \frac{\tilde{\Delta}'(-1/\tau)}{\tilde{\Delta}(-1/\tau)}
 = \frac{6i}{\pi \tau^{2}}\, G_2(-1/\tau) \\
&= \frac{6i}{\pi \tau^{2}} \left( \tau^{2} G_2(\tau) - 2\pi i \tau \right)
 = \frac{6i}{\pi}\, G_2(\tau) + \frac{12}{\tau},
\end{align*}


which is the same as the logarithmic derivative of the right-hand side of (5.74). Therefore
if we recall the trivial fact that if f, g are two meromorphic functions that are not
identically zero for which f′/f = g′/g, then f = cg for some constant c, then, together with the
fact that (5.74) is satisfied for τ = i, we deduce that (5.74) holds for general τ ∈ ℍ. Thus
Δ̃(τ) is a modular form of weight 12. As such, it is an element of the vector space M_12(Γ),
which we know (Theorem 5.24) is of dimension 2 and spanned by the original modular
discriminant Δ(τ) and the Eisenstein series G_12. It follows that Δ̃ = αΔ + βG_12 for some
α, β ∈ ℂ. Comparing the constant and linear terms in the Fourier expansions of Δ, G_12,
and Δ̃ shows that α = 1 and β = 0 and finishes the proof.¹⁰


The relation between A and G, that was obtained as part of the proof is of indepen- 
dent interest, so we note it as a corollary. 


Corollary 5.33. The functions Δ(τ) and G_2(τ) are related to each other via


\[ \frac{\Delta'(\tau)}{\Delta(\tau)} = \frac{6i}{\pi}\, G_2(\tau) \qquad (\tau \in \mathbb{H}). \]


5.14.2 The modular lambda function 


In this and the next subsection, we denote Q = e^{πiτ}. (This is the square root of the parameter
q = e^{2πiτ} we have been using throughout much of the discussion in this chapter and is
more convenient for some expansions discussed below.¹¹)


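Theorem 5.34. The modular lambda function λ(τ) defined in (5.64) and the
complementary function 1 − λ(τ) have the infinite product representations


\begin{align}
\lambda(\tau) &= 16\,Q \prod_{n=1}^{\infty} \left( \frac{1 + Q^{2n}}{1 + Q^{2n-1}} \right)^{8}, \tag{5.75} \\
1 - \lambda(\tau) &= \prod_{n=1}^{\infty} \left( \frac{1 - Q^{2n-1}}{1 + Q^{2n-1}} \right)^{8}, \tag{5.76}
\end{align}


with Q = e^{πiτ}, τ ∈ ℍ.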


10 In fact, the constant coefficients of Δ and Δ̃ are 0, which means they both belong to the codimension-1
subspace of M_12(Γ) of forms with a constant coefficient 0. Such forms are known as cusp forms. So an
alternative way of phrasing the argument above without mentioning G_12 is by saying that Δ̃ must be
proportional to Δ, since they are both cusp forms of weight 12, and since the space of such forms is
one-dimensional and spanned by Δ.

11 In many textbooks on modular forms, the letter q may be used alternately for either e^{πiτ} or e^{2πiτ}
depending on the context, so pay close attention to the definitions when you read the literature.
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Proof of (5.75). Fix t € H, and let L, = Z + TZ denote as usual the associated lattice 
with fundamental period pair w, = 1, w, = T, and let p(z) = p(z; L+) be the Weierstrass 
function of L,. Define the meromorphic function F : C — C by the expression 


fore) mi(2n—1)t—-27iz mi(2n+1)T-21iz 
7 (14+ e™! (+e ) 
F(Z) = I (1 + e2"int-2riz)2 j 6.77) 


n=-00 


Denote the $n$th factor in this two-sided infinite product by $\varphi_n = \varphi_n(z; \tau)$. The product
of $\varphi_n$ over positive values of $n$ clearly converges absolutely, uniformly as $z$ ranges over
compacts in $\mathbb{C}$ away from any poles of individual factors, due to the exponential decay of
$|e^{\pi i n\tau}|$. Moreover, $\varphi_n$ has the symmetry $\varphi_{-n}(-z; \tau) = \varphi_n(z; \tau)$, which is easy to check,
implying the same convergence also for the product over negative $n$. Thus $F(z)$ is well-defined
and is a meromorphic function with poles only at places where one of the individual
factors $\varphi_n$ has a pole (more on that below).

The usefulness of $F(z)$ is related to the fact, which we now observe, that it is a doubly
periodic function with period lattice $L_\tau$. This is easy to see: the relation $F(z + 1) = F(z)$
holds trivially, and to show that $F(z + \tau) = F(z)$, observe that the substitution $z \mapsto z + \tau$
maps each factor $\varphi_n$ to its predecessor $\varphi_{n-1}$, that is, we have the relation
$\varphi_n(z + \tau; \tau) = \varphi_{n-1}(z; \tau)$.

Next, an examination of the factors involved in the definition of $\varphi_n$ and their zeros
(as a function of $z$ with fixed $\tau$) reveals that $F(z)$ has double poles at the half-period
$z = v_1 = 1/2$ (in the notation of (4.23)) and all its $L_\tau$-translates, and double zeros at the
half-period $z = v_3 = (1 + \tau)/2$ and all its $L_\tau$-translates. There are no other zeros or poles.
This means that in fact $F(z)$ has the same zeros and poles as the doubly periodic function
$\frac{\wp(z) - e_3}{\wp(z) - e_1}$. Therefore the quotient $F(z)\bigl(\frac{\wp(z) - e_3}{\wp(z) - e_1}\bigr)^{-1}$ is a doubly periodic function with no poles
and so must be a constant. Taking $z = 0$ shows that the constant is equal to $F(0)$ (the
limit of $\frac{\wp(z) - e_3}{\wp(z) - e_1}$ as $z \to 0$ is 1 because of cancellation of the principal parts of the poles of
the numerator and denominator at $z = 0$; refer to (4.10)). So we have shown that

\[
F(z) = F(0)\,\frac{\wp(z) - e_3}{\wp(z) - e_1}.
\]

Now set $z = v_2 = \tau/2$ in this identity to get that

\[
F(\tau/2) = F(v_2) = F(0)\,\frac{\wp(v_2) - e_3}{\wp(v_2) - e_1} = F(0)\,\frac{e_2 - e_3}{e_2 - e_1} = F(0)\,\lambda(\tau).
\]


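In other words, we have shown that the lambda function can be represented in terms
of $F(z)$ as $\lambda(\tau) = F(\tau/2)/F(0)$. Making the relevant substitutions into (5.77), we see that

\begin{align*}
F(0) &= \prod_{n=-\infty}^{\infty} \frac{(1 + e^{\pi i(2n-1)\tau})(1 + e^{\pi i(2n+1)\tau})}{(1 + e^{2\pi i n\tau})^{2}} \\
&= \frac{(1 + Q^{-1})(1 + Q)}{4} \prod_{n=1}^{\infty} \frac{(1 + Q^{2n-1})(1 + Q^{2n+1})(1 + Q^{-(2n-1)})(1 + Q^{-(2n+1)})}{(1 + Q^{2n})^{2}(1 + Q^{-2n})^{2}} \\
&= \frac{(1 + Q)^{2}}{4Q} \prod_{n=1}^{\infty} \frac{(1 + Q^{2n-1})^{2}(1 + Q^{2n+1})^{2}}{(1 + Q^{2n})^{4}}
 = \frac{1}{4Q} \prod_{n=1}^{\infty} \frac{(1 + Q^{2n-1})^{4}}{(1 + Q^{2n})^{4}}.
\end{align*}

Similarly,

\begin{align*}
F(\tau/2) &= \prod_{n=-\infty}^{\infty} \frac{(1 + e^{\pi i(2n-2)\tau})(1 + e^{\pi i(2n)\tau})}{(1 + e^{\pi i(2n-1)\tau})^{2}} \\
&= \prod_{n=1}^{\infty} \frac{(1 + Q^{2n-2})(1 + Q^{2n})(1 + Q^{-2n})(1 + Q^{-(2n-2)})}{(1 + Q^{2n-1})^{2}(1 + Q^{-(2n-1)})^{2}} \\
&= \prod_{n=1}^{\infty} \frac{(1 + Q^{2n-2})^{2}(1 + Q^{2n})^{2}}{(1 + Q^{2n-1})^{4}}
 = 4 \prod_{n=1}^{\infty} \frac{(1 + Q^{2n})^{4}}{(1 + Q^{2n-1})^{4}}.
\end{align*}

Combining the above results yields precisely the infinite product formula (5.75).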


Proof of (5.76). Exercise 5.15. 


5.14.3 The Jacobi thetanull functions 


Our final result on infinite product expansions concerns the Jacobi thetanull functions. 


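Theorem 5.35. The Jacobi thetanull functions have the infinite product representations

\[
\theta_2(\tau) = 2Q^{1/4} \prod_{n=1}^{\infty} (1 - Q^{2n})(1 + Q^{2n})^{2}, \tag{5.78}
\]
\[
\theta_3(\tau) = \prod_{n=1}^{\infty} (1 - Q^{2n})(1 + Q^{2n-1})^{2}, \tag{5.79}
\]
\[
\theta_4(\tau) = \prod_{n=1}^{\infty} (1 - Q^{2n})(1 - Q^{2n-1})^{2}, \tag{5.80}
\]

with the usual notation $Q = e^{\pi i\tau}$, $\tau \in \mathbb{H}$.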


As a corollary of (5.75), (5.76), and (5.78)–(5.80), we obtain two additional remarkable
identities relating $\lambda(\tau)$ to the Jacobi thetanull functions.


Corollary 5.36. The modular lambda function $\lambda(\tau)$ satisfies the relations

\[
\lambda(\tau) = \left( \frac{\theta_2(\tau)}{\theta_3(\tau)} \right)^{4}, \qquad
1 - \lambda(\tau) = \left( \frac{\theta_4(\tau)}{\theta_3(\tau)} \right)^{4}. \tag{5.81}
\]

An additional interesting corollary worth noting is the following.

Corollary 5.37. The Jacobi thetanull functions satisfy the identity

\[
\theta_2(\tau)^4 + \theta_4(\tau)^4 = \theta_3(\tau)^4. \tag{5.82}
\]
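
The product formulas of this section lend themselves to a quick numerical sanity check. The following sketch evaluates truncated versions of the products (5.75)–(5.76) and (5.78)–(5.80) at a sample point and compares them with the theta series. It assumes the standard series definitions $\theta_2 = \sum_n Q^{(n+1/2)^2}$, $\theta_3 = \sum_n Q^{n^2}$, $\theta_4 = \sum_n (-1)^n Q^{n^2}$, which may differ in form from the definitions used earlier in the chapter; the function names and the sample point are illustrative choices only.

```python
import cmath

PI = cmath.pi

def q_pow(tau, x):
    """Return Q**x for Q = exp(pi*i*tau), computed as exp(pi*i*tau*x) to avoid branch issues."""
    return cmath.exp(1j * PI * tau * x)

def theta_series(tau, terms=40):
    """theta_2, theta_3, theta_4 from their series (assumed standard definitions)."""
    ns = range(-terms, terms + 1)
    th2 = sum(q_pow(tau, (n + 0.5) ** 2) for n in ns)
    th3 = sum(q_pow(tau, n * n) for n in ns)
    th4 = sum((-1) ** abs(n) * q_pow(tau, n * n) for n in ns)
    return th2, th3, th4

def theta_products(tau, terms=200):
    """theta_2, theta_3, theta_4 from the truncated products (5.78)-(5.80)."""
    th2, th3, th4 = 2 * q_pow(tau, 0.25), 1.0, 1.0
    for n in range(1, terms + 1):
        even, odd = q_pow(tau, 2 * n), q_pow(tau, 2 * n - 1)
        th2 *= (1 - even) * (1 + even) ** 2
        th3 *= (1 - even) * (1 + odd) ** 2
        th4 *= (1 - even) * (1 - odd) ** 2
    return th2, th3, th4

def lambda_products(tau, terms=200):
    """lambda and 1 - lambda from the truncated products (5.75)-(5.76)."""
    lam, one_minus_lam = 16 * q_pow(tau, 1), 1.0
    for n in range(1, terms + 1):
        even, odd = q_pow(tau, 2 * n), q_pow(tau, 2 * n - 1)
        lam *= ((1 + even) / (1 + odd)) ** 8
        one_minus_lam *= ((1 - odd) / (1 + odd)) ** 8
    return lam, one_minus_lam

tau = 0.3 + 0.8j  # arbitrary sample point in the upper half-plane
th2, th3, th4 = theta_products(tau)
lam, one_minus_lam = lambda_products(tau)
print(abs(lam - (th2 / th3) ** 4))            # discrepancy in (5.81), first relation
print(abs(one_minus_lam - (th4 / th3) ** 4))  # discrepancy in (5.81), second relation
print(abs(th2 ** 4 + th4 ** 4 - th3 ** 4))    # discrepancy in (5.82)
```

All three printed discrepancies should be at the level of floating-point rounding. Such a check is of course no substitute for the proofs, but it is a useful guard against sign and exponent slips.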


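The infinite products (5.78)–(5.80) are particular cases of a more general product
identity for the full Jacobi theta function (involving two variables $z$ and $\tau$), known as
the Jacobi triple product identity.

Theorem 5.38 (Jacobi triple product identity). We have the identity

\[
\sum_{n=-\infty}^{\infty} \exp\bigl(\pi i n^2 \tau + 2\pi i n z\bigr)
= \prod_{n=1}^{\infty} \bigl(1 - e^{2\pi i n\tau}\bigr)\bigl(1 + e^{(2n-1)\pi i\tau + 2\pi i z}\bigr)\bigl(1 + e^{(2n-1)\pi i\tau - 2\pi i z}\bigr) \tag{5.83}
\]

for $\tau \in \mathbb{H}$ and $z \in \mathbb{C}$.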
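
As an illustration of (5.83), the following minimal sketch compares truncations of the two sides at a sample pair $(\tau, z)$. The function name, the truncation order, and the sample values are arbitrary choices made here for illustration.

```python
import cmath

PI = cmath.pi

def triple_product_sides(tau, z, terms=60):
    """Return truncated (sum, product) sides of the Jacobi triple product identity (5.83)."""
    lhs = sum(cmath.exp(1j * PI * n * n * tau + 2j * PI * n * z)
              for n in range(-terms, terms + 1))
    rhs = 1.0
    for n in range(1, terms + 1):
        odd = (2 * n - 1) * 1j * PI * tau
        rhs *= 1 - cmath.exp(2j * PI * n * tau)
        rhs *= 1 + cmath.exp(odd + 2j * PI * z)
        rhs *= 1 + cmath.exp(odd - 2j * PI * z)
    return lhs, rhs

lhs, rhs = triple_product_sides(0.2 + 0.9j, 0.3 - 0.4j)
print(abs(lhs - rhs))  # expected to be at the level of rounding error
```

Setting $z = 0$ in the same function reproduces the product (5.79) for $\theta_3$, which is the simplest of the three specializations asked for in Exercise 5.17.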


For a complex-analytic proof of identity (5.83) using techniques of a flavor similar
to those used in the proof of Theorem 5.34, see [66, Ch. 10]. An alternative approach
proceeds by rewriting (5.83) as

\[
\sum_{n=-\infty}^{\infty} x^{n^2} y^{n} = \prod_{n=1}^{\infty} (1 - x^{2n})(1 + y x^{2n-1})(1 + y^{-1} x^{2n-1})
\]

(by making the substitutions $x = e^{\pi i\tau}$, $y = e^{2\pi i z}$) or, equivalently,

\[
\prod_{n=1}^{\infty} \frac{1}{1 - x^{2n}} \sum_{n=-\infty}^{\infty} x^{n^2} y^{n} = \prod_{n=1}^{\infty} (1 + y x^{2n-1})(1 + y^{-1} x^{2n-1}).
\]

This can be given a combinatorial proof by interpreting both sides as bivariate generating
functions for certain classes of objects associated with integer partitions. These
classes are then shown to be in explicit bijection with each other, implying the equality
of the coefficients at $x^j y^k$ on both sides of the equation for all $j$, $k$. See [54, Sec. 6] for
details.


Proof of Theorem 5.35. Exercise 5.17. 


Suggested exercises for Section 5.14. 5.12, 5.13, 5.14, 5.15, 5.16, 5.17, 5.18, 5.19, 5.20, 5.21.

Exercises for Chapter 5

5.1 Show that the Weierstrass $\wp$-function, regarded as a function $\wp(z; \tau)$ of both the
"elliptic" variable $z$ and the modular variable $\tau$, satisfies the transformation relation

\[
\wp\!\left( \frac{z}{c\tau + d};\, \frac{a\tau + b}{c\tau + d} \right) = (c\tau + d)^{2}\, \wp(z; \tau) \qquad (z \in \mathbb{C},\ \tau \in \mathbb{H}) \tag{5.84}
\]

for all $a, b, c, d \in \mathbb{Z}$ for which $ad - bc = 1$.

5.2 Prove Lemma 5.3. (Hint: reminding yourself of the statement of Theorem 3.13 from
Chapter 3 might be helpful.)

5.3 Structure of the modular group. Prove that the algebraic structure of the modular
group $\Gamma$ can be expressed succinctly by the relation

\[
\Gamma \cong \mathbb{Z}_2 * \mathbb{Z}_3.
\]

In words, this says that $\Gamma$ is isomorphic to the free product of the cyclic groups of
orders 2 and 3. More precisely, show that it is freely generated by the elements
$S$, $U$, that is, that if the standard cyclic groups $\mathbb{Z}_2$ and $\mathbb{Z}_3$ have respective generators
denoted $\gamma_2$ and $\gamma_3$, then the map

\[
\varphi : \mathbb{Z}_2 * \mathbb{Z}_3 \to \Gamma
\]

(where $\mathbb{Z}_2 * \mathbb{Z}_3$ denotes the free product of those groups) defined by

\[
\varphi(\gamma_2) = S, \qquad \varphi(\gamma_3) = U,
\]

and extended in the obvious way to a group homomorphism is a group isomorphism.
(Note: this is a well-known result. A simple proof is given in [4].)

5.4 The congruence subgroup $\Gamma(2)$. Let

\[
\Gamma(2) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma \,:\, a, d \text{ are odd},\ b, c \text{ are even} \right\}.
\]

It is easy to see that $\Gamma(2)$ is a subgroup of the modular group $\Gamma$ either through direct
verification or by noting that $\Gamma(2)$ is the kernel of the homomorphism that sends
any matrix $A$ in $\Gamma$ to its reduction mod 2, an element of the matrix group $SL(2, \mathbb{Z}_2)$.
The group $\Gamma(2)$ belongs to the class of subgroups of $\Gamma$ known as the congruence
subgroups.

(a) Prove that the two matrices

\[
A = \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \tag{5.85}
\]

generate $\Gamma(2)$.

(b) Prove that $\Gamma(2)$ is freely generated by $A$ and $B$. That is, the only products we
can form from $A$, $A^{-1}$, $B$, and $B^{-1}$ that give the identity element are those that
reduce to the identity element by successively canceling out the appearances
of $AA^{-1}$, $A^{-1}A$, $BB^{-1}$, and $B^{-1}B$.

(c) Prove that the set

g= fzem : -1< [Re] <il- >1, z+ >1} u{0)

(Fig. 5.3) is a fundamental domain under the action of $\Gamma(2)$ in a sense that you
should formulate precisely as an analogue of the statement of Theorem 5.4.

(d) Find the index $[\Gamma : \Gamma(2)]$.

Figure 5.3: The fundamental domain $\mathcal{G}$ for the congruence subgroup $\Gamma(2)$.

5.5 Prove Theorem 5.10.

5.6 Prove Lemma 5.21.

5.7 Fill in the missing detail in the proof of Theorem 5.24 by proving that $|\mathcal{A}_k| = |\mathcal{B}_k|$
for all $k > 0$, where $\mathcal{A}_k$ and $\mathcal{B}_k$ are defined by (5.52)–(5.53).

5.8 Write a computer program to generate the change of basis matrices (in both directions)
between the two linear bases $\mathcal{A}_k$ and $\mathcal{B}_k$ for the vector space $M_{2k}$ described
in Theorem 5.24. Investigate these matrices for small values of $k$ and see if you can
work out a formula for them that is valid in the general case, or find other interesting
patterns.

5.9 Prove the transformation properties (5.57)–(5.59).

5.10 Prove Theorem 5.27. The idea is to show that each of the functions on the right-hand
sides of (5.60)–(5.63) has the right structural properties that make it an element of
the space $M_{2k}$ for an appropriate value of $k$, then conclude that it is a constant
multiple of the function on the left-hand side, and finally find a way to determine
the value of the constant.

5.11 Prove Theorem 5.28.

5.12 Prove that if $z \in \mathbb{C} \setminus \mathbb{R}$, then

\[
\lim_{N \to \infty} \cot(Nz) = \begin{cases} -i & \text{if } \operatorname{Im}(z) > 0, \\ i & \text{if } \operatorname{Im}(z) < 0. \end{cases}
\]

5.13 Prove that $G_2$ satisfies the general transformation relation

\[
G_2\!\left( \frac{a\tau + b}{c\tau + d} \right) = (c\tau + d)^{2} G_2(\tau) - 2\pi i c (c\tau + d), \qquad \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Gamma,
\]

under the action of the modular group.

5.14 Prove that $G_2(i) = \pi$.

5.15 Prove the infinite product formula (5.76) by applying a similar technique to that
used in the proof of (5.75).

5.16 (a) Enter a truncated version of the infinite product formula (5.75) into a computer
algebra system of your choice, to obtain the first 10 coefficients in the $Q$-series
expansion of the modular $\lambda$ function.

(b) Enter the first few coefficients into the search box on the On-Line Encyclopedia
of Integer Sequences [W21]. If you have the correct coefficients, then the search
results will show you a lot of additional information and references on this
sequence of numbers and on the modular lambda function. (You can also try
doing the same with the Fourier coefficients for $\Delta$, the Eisenstein series, or
other sequences of integers that you encounter in modular forms or any other
area of mathematics.)

5.17 Show how to derive formulas (5.78)–(5.80) from (5.83).


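In the exercises below, we define renormalized versions of the Eisenstein series $G_2$, $G_4$, $G_6$
by

\[
E_2(\tau) = \frac{3}{\pi^2}\, G_2(\tau) = 1 - 24 \sum_{n=1}^{\infty} \sigma_1(n) q^n, \tag{5.86}
\]
\[
E_4(\tau) = \frac{45}{\pi^4}\, G_4(\tau) = 1 + 240 \sum_{n=1}^{\infty} \sigma_3(n) q^n, \tag{5.87}
\]
\[
E_6(\tau) = \frac{945}{2\pi^6}\, G_6(\tau) = 1 - 504 \sum_{n=1}^{\infty} \sigma_5(n) q^n. \tag{5.88}
\]

These versions of the Eisenstein series are often used in the literature in connection with
number-theoretic applications.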

5.18 (a) Prove that $E_2$, $E_4$, and $E_6$ satisfy the following system of differential equations,
known as Ramanujan's identities:

\[
\frac{1}{2\pi i}\, E_2'(\tau) = \frac{1}{12} \bigl( E_2(\tau)^2 - E_4(\tau) \bigr), \tag{5.89}
\]
\[
\frac{1}{2\pi i}\, E_4'(\tau) = \frac{1}{3} \bigl( E_2(\tau) E_4(\tau) - E_6(\tau) \bigr), \tag{5.90}
\]
\[
\frac{1}{2\pi i}\, E_6'(\tau) = \frac{1}{2} \bigl( E_2(\tau) E_6(\tau) - E_4(\tau)^2 \bigr). \tag{5.91}
\]

(b) For each of identities (5.89)–(5.91), find the Fourier expansions of both sides
and compare the coefficients to obtain interesting number-theoretic identities.
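5.19 Prove the identities

\[
E_2(\tau) = \frac{1}{6} \left( 4E_2(2\tau) + E_2\!\left( \frac{\tau}{2} \right) + E_2\!\left( \frac{\tau + 1}{2} \right) \right),
\]
\[
E_4(\tau) = \frac{1}{18} \left( 16E_4(2\tau) + E_4\!\left( \frac{\tau}{2} \right) + E_4\!\left( \frac{\tau + 1}{2} \right) \right).
\]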


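5.20 Prove the identities

\[
\theta_2(\tau)^4 = \frac{1}{3} \left( E_2\!\left( \frac{\tau + 1}{2} \right) - E_2\!\left( \frac{\tau}{2} \right) \right), \tag{5.92}
\]
\[
\theta_3(\tau)^4 = \frac{1}{3} \left( 4E_2(2\tau) - E_2\!\left( \frac{\tau}{2} \right) \right), \tag{5.93}
\]
\[
\theta_4(\tau)^4 = \frac{1}{3} \left( 4E_2(2\tau) - E_2\!\left( \frac{\tau + 1}{2} \right) \right), \tag{5.94}
\]
\[
\theta_2(\tau)^8 = \frac{16}{15} \bigl( E_4(\tau) - E_4(2\tau) \bigr), \tag{5.95}
\]
\[
\theta_3(\tau)^8 = \frac{1}{15} \left( 16E_4(\tau) - E_4\!\left( \frac{\tau + 1}{2} \right) \right), \tag{5.96}
\]
\[
\theta_4(\tau)^8 = \frac{1}{15} \left( 16E_4(\tau) - E_4\!\left( \frac{\tau}{2} \right) \right). \tag{5.97}
\]

Guidance for proving (5.92)–(5.94). Define the functions

\[
R_2(\tau) = \frac{E_2\!\left( \frac{\tau + 1}{2} \right) - E_2\!\left( \frac{\tau}{2} \right)}{3\theta_2(\tau)^4}, \qquad
R_3(\tau) = \frac{4E_2(2\tau) - E_2\!\left( \frac{\tau}{2} \right)}{3\theta_3(\tau)^4}, \qquad
R_4(\tau) = \frac{4E_2(2\tau) - E_2\!\left( \frac{\tau + 1}{2} \right)}{3\theta_4(\tau)^4},
\]

and let

\[
\Phi_1 = R_2 + R_3 + R_4, \qquad
\Phi_2 = R_2R_3 + R_2R_4 + R_3R_4, \qquad
\Phi_3 = R_2R_3R_4.
\]

Show that $\Phi_1$, $\Phi_2$, $\Phi_3$ are entire modular forms of weight 0 and use this to show that
$\Phi_1 = 3$, $\Phi_2 = 3$, $\Phi_3 = 1$. Deduce from this that $R_2 = R_3 = R_4 = 1$.

5.21 Show that by expanding both sides of (5.93) and (5.96) as Fourier series and comparing
the coefficients, we will obtain interesting number-theoretic identities related
to counting the number of ways in which integers can be represented as a sum of
squares. Specifically, let $r_4(n)$ and $r_8(n)$ denote the numbers of ways to represent
an integer $n$ as a sum of 4 squares and as a sum of 8 squares, respectively. Prove
the following identities, due to Jacobi:

\[
r_4(n) = 8 \sum_{d \mid n,\ 4 \nmid d} d,
\]
\[
r_8(n) = 16\,(-1)^n \sum_{d \mid n} (-1)^d d^3.
\]

(In particular, we get the fact that every positive integer can be expressed as a sum of four
squares, a famous result in number theory proved by Lagrange in 1770.)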
5.22 Use (5.92)-(5.93) to prove that 


203(t)* — 0,(t)* 


foe) 
jet 
s =1+24 Ý Togan) e", (5.98) 


n=1 


where $\sigma_{\mathrm{odd}}(n)$, the odd divisor function, is defined by

\[
\sigma_{\mathrm{odd}}(n) = \sum_{d \mid n,\ d \text{ odd}} d \qquad (n \ge 1).
\]


5.23 Use the Jacobi triple product identity (5.83) to derive the following identity, known
as the Euler pentagonal number theorem:

\[
\prod_{n=1}^{\infty} (1 - x^n) = \sum_{k=-\infty}^{\infty} (-1)^k x^{k(3k-1)/2} \qquad (|x| < 1).
\]
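
Before attempting the proofs in Exercise 5.21, it can be reassuring to confirm Jacobi's two formulas by brute force for small $n$. The sketch below counts representations as sums of squares by repeated convolution and compares the counts with the divisor-sum formulas. Here a "representation" is taken to be an ordered tuple of integers with signs counted, which is the convention under which $\theta_3(\tau)^k$ is the generating function of $r_k(n)$; the helper names are illustrative.

```python
def rep_counts(k, limit):
    """Return [r_k(0), ..., r_k(limit)], where r_k(n) counts (x_1, ..., x_k) in Z^k with sum of squares n."""
    # r_1(b): number of integers x with x*x == b
    r1 = [0] * (limit + 1)
    x = 0
    while x * x <= limit:
        r1[x * x] += 1 if x == 0 else 2
        x += 1
    counts = [1] + [0] * limit  # the empty tuple represents 0
    for _ in range(k):
        new = [0] * (limit + 1)
        for a, ca in enumerate(counts):
            if ca:
                for b in range(limit + 1 - a):
                    if r1[b]:
                        new[a + b] += ca * r1[b]
        counts = new
    return counts

def jacobi_r4(n):
    """Jacobi's formula: 8 times the sum of the divisors of n not divisible by 4."""
    return 8 * sum(d for d in range(1, n + 1) if n % d == 0 and d % 4 != 0)

def jacobi_r8(n):
    """Jacobi's formula: 16 (-1)^n times the sum over divisors d of (-1)^d d^3."""
    return 16 * (-1) ** n * sum((-1) ** d * d ** 3 for d in range(1, n + 1) if n % d == 0)

LIMIT = 60
r4 = rep_counts(4, LIMIT)
r8 = rep_counts(8, LIMIT)
print(all(r4[n] == jacobi_r4(n) for n in range(1, LIMIT + 1)))
print(all(r8[n] == jacobi_r8(n) for n in range(1, LIMIT + 1)))
```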


6 Sphere packing in 8 dimensions 


Discovery in mathematics is not a matter of logic. It is rather the result of mysterious powers which 
no one understands, and in which unconscious recognition of beauty must play an important part. 
Out of an infinity of designs, a mathematician chooses one pattern for beauty’s sake and pulls it 
down to earth, no one knows how. 


Marston Morse, “Mathematics and the arts” (1959) 


6.1 Motivation: the sphere packing problem in d dimensions 


In 1611, two years after publishing the first two of his famous laws of planetary motion, 
the astronomer Johannes Kepler also published a curious observation about geometry 
in an essay titled “On the Six-Cornered Snowflake.” Kepler speculated that the most ef- 
ficient way to pack solid spheres of equal size in three-dimensional space was using 
the lattice arrangement now known as the face-centered cubic (Fig. 6.1). This packing 
results in a packing density (the fraction of the volume of the packed space occupied
by the interior of the spheres) of $\pi/(3\sqrt{2}) \approx 0.7405$, and Kepler’s conjecture was the statement that
no other configuration of spheres can achieve (in a limiting sense when this is done 
over larger and larger volumes that fill up space) a higher packing density. Although 
intuitively plausible, even obvious-sounding to anyone who has tried to stack oranges 
or other spherical objects, the conjecture nonetheless proved extremely resistant to at- 
tempts by mathematicians over the ensuing centuries to prove it rigorously. In the late 
twentieth century, it stood as one of the most famous and longest-standing open prob- 
lems in mathematics (among other markers of status, it was included as part of the 18th 
problem on Hilbert’s famous list of 23 problems) and was finally proved by Thomas Hales 
[39] in 1998. 

We will not discuss Hales’s proof, which is very involved and does not use complex 
analysis; the book [38] is a good reference on this topic. However, it turns out that sphere 


Figure 6.1: The Kepler conjecture, proved by Thomas Hales in 1998, states that the highest density for
packing spheres in $\mathbb{R}^3$ is $\pi/(3\sqrt{2})$. This is the packing density of the two packings shown: (a) the cubic close
packing (derived from the lattice known as the face-centered cubic) and (b) the hexagonal close packing.


Open Access. © 2023 the author(s), published by De Gruyter. This work is licensed under the Creative
Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. 
https://doi.org/10.1515/9783110796810-007 




Figure 6.2: The hexagonal packing is the densest way to pack unit circles in the plane. 


packing is extremely interesting to study in other dimensions as well (where “spheres” 
now refer to hyperspheres of appropriate dimension, and the meaning of “packing” re- 
mains the same). For example, the case of sphere packing in two dimensions (that is, 
circle packing) is also interesting, though it is much simpler to understand than in three 
dimensions and has as its solution the hexagonal lattice packing with a packing density 
of $\pi/\sqrt{12} \approx 0.9069$ (a fact that was shown, in increasing levels of generality and rigor, by Gauss in
1831, Thue in 1890, and Tóth in 1940); see Fig. 6.2. Much research in recent decades has 
focused on studying the question in dimensions higher than 3; see [18]. 

Our goal in this chapter is to explain the remarkable mathematical ideas behind the 
recent solution of the sphere packing problem in dimensions 8 and 24. These are cur- 
rently the only dimensions apart from d = 2,3 for which the problem has been solved. 
Specifically, we will give a detailed proof of Viazovska’s theorem. 


Theorem 6.1 (Viazovska’s theorem). The optimal sphere packing density in $\mathbb{R}^8$ is $\pi^4/384$.
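
To see where the constant $\pi^4/384$ comes from, one can compute the density of the $E_8$ packing directly. The short computation below is a sketch that assumes two standard facts about the $E_8$ lattice which are not derived in this section (its fundamental cell has volume 1, and the minimal distance between distinct lattice points is $\sqrt{2}$, so the packing spheres have radius $\sqrt{2}/2$), together with the volume formula $\pi^4 r^8/24$ for a ball of radius $r$ in $\mathbb{R}^8$.

```latex
\[
\text{density}
= \frac{\operatorname{vol}\bigl(B_8(\sqrt{2}/2)\bigr)}{\operatorname{vol}(\text{fundamental cell})}
= \frac{\pi^4}{24}\left(\frac{\sqrt{2}}{2}\right)^{8}
= \frac{\pi^4}{24 \cdot 16}
= \frac{\pi^4}{384} \approx 0.2537 .
\]
```

This only gives the lower bound half of the theorem; the content of Viazovska's work is that no packing does better.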


Theorem 6.1 was proved by Maryna Viazovska [71] in 2016.¹ Following the appearance
of her groundbreaking paper, Viazovska’s new insights led within days to a successful
solution of the problem in dimension 24 by her and her collaborators Cohn, Kumar,
Miller, and Radchenko [16]. In 2022, Viazovska was awarded the Fields Medal for these 
remarkable achievements and for further contributions to related problems in geome- 
try and Fourier analysis. For more details, see [12, 13, 20, 52]. 

One of the remarkable aspects of the solutions to the sphere packing problem in 
both dimensions 8 and 24 is that they use very little geometry: in fact, what little geo- 
metrical reasoning appears only does so in connection with the explicit constructions 


1 This statement (and our name for Theorem 6.1) are simplifications: this theorem summarizes the re-
sults and contributions of several mathematicians. However, in this writer’s opinion, Viazovska’s contri-
bution being the last, as well as being inarguably most ingenious and remarkable, makes her deserving
of being the eponym of the theorem.
of the optimal packings (which imply lower bounds for the packing density), whereas
the proof of the matching upper bounds for the packing density instead draws primarily 
on complex analysis and the theory of modular forms, spiced up with a bit of Fourier 
analysis. If you have read Chapter 5, then you are well equipped to tackle this modern 
and quite beautiful application of complex analysis. 


The $E_8$ lattice and sphere packing


The $E_8$ sphere packing is a packing in which each of the spheres of the packing is centered at a vertex
of the so-called $E_8$ lattice, a lattice with many remarkable properties that is closely associated with (and
shares a notation with) the exceptional Lie algebra $E_8$.

As the $E_8$ packing is an intrinsically 8-dimensional object, it is somewhat difficult to visualize what
the packing “looks like.” One can nonetheless gain some understanding of the qualitative behavior of the
packing by considering what a single “cell” of the packing looks like, that is, a single sphere centered
at the origin together with the spheres of the packing that are tangent to it. In the case of the $E_8$ packing,
there are 240 such tangent spheres. Each of the tangent spheres is itself tangent to 56 of the other 239
spheres. This is visualized in the figure below, where the 240 spheres are represented as dots, and two
dots representing spheres that are mutually tangent are connected with a line. (The positions of the dots
are given by a particularly symmetric two-dimensional projection of their sphere centers in $\mathbb{R}^8$.)

A formal construction of the $E_8$ lattice is given in Section A.7 in the appendix.
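
The counts 240 and 56 quoted above can be checked by direct enumeration. The sketch below assumes the standard coordinate description of the shortest nonzero vectors of the $E_8$ lattice (all permutations of $(\pm 1, \pm 1, 0, \ldots, 0)$, together with all vectors $(\pm\frac12, \ldots, \pm\frac12)$ having an even number of minus signs); this description may be presented differently in Section A.7. Coordinates are doubled so that the arithmetic stays in the integers. Two spheres of radius $\sqrt{2}/2$ centered at such vectors $u$, $v$ are tangent exactly when $\|u - v\|^2 = 2$, that is, when $u \cdot v = 1$ (equivalently, 4 in doubled coordinates).

```python
from itertools import combinations, product

def e8_minimal_vectors_doubled():
    """Shortest nonzero vectors of E8 (standard coordinates, assumed), with every coordinate doubled."""
    vectors = []
    # Type 1: (+-1, +-1, 0, ..., 0) in any two positions; doubled entries are +-2.
    for i, j in combinations(range(8), 2):
        for si, sj in product((2, -2), repeat=2):
            v = [0] * 8
            v[i], v[j] = si, sj
            vectors.append(tuple(v))
    # Type 2: (+-1/2, ..., +-1/2) with an even number of minus signs; doubled entries are +-1.
    for signs in product((1, -1), repeat=8):
        if signs.count(-1) % 2 == 0:
            vectors.append(signs)
    return vectors

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

vectors = e8_minimal_vectors_doubled()
# Doubled inner product 4 corresponds to inner product 1, i.e. mutually tangent spheres.
tangent_counts = [sum(1 for v in vectors if dot(u, v) == 4) for u in vectors]
print(len(vectors), set(tangent_counts))
```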


Figure 6.3: A two-dimensional projection of a packing cell in the $E_8$ sphere packing, which realizes the
optimal sphere packing in $\mathbb{R}^8$, having a packing density of $\pi^4/384$.

6.2 A high-level overview of the proof


To understand the proof of Theorem 6.1, a bit of background is required to set up the
problem for the final part of the proof, the part that involves complex analysis and is of
main interest to us. Our presentation is self-contained and is split between this chapter
and Appendix A. Here we give a brief overview of the full structure of the proof:

1. Background material: this consists of definitions and basic facts about sphere packings
   and lattices. This material is presented in Sections A.1–A.6 of the Appendix.

2. Lower bound: construction of an optimal packing. An 8-dimensional sphere
   packing now known to be optimal is the $E_8$ sphere packing, which is based on the
   $E_8$ lattice; see the box “The $E_8$ lattice and sphere packing.” A few basic facts about
   this lattice will be needed, and we discuss the relevant material in Section A.7 in the
   Appendix. This is the “easy” part of the proof (at least in the sense that it is based on
   little more than elementary linear algebra), which gives a lower bound on the optimal
   sphere packing density.

3. Upper bound, part I: the Cohn–Elkies bounds and magic function conjectures.
   Conceptually more difficult is to prove an upper bound on the packing density, as
   that involves proving that no packing can have a density better than some number.
   Since the family of possible packings is very large (in fact, infinite-dimensional), it is
   not obvious how to approach this. A beautiful technique for deriving upper bounds
   was introduced by Cohn and Elkies [14], who discovered that the Poisson summation
   formula from harmonic analysis (more precisely, a multidimensional version of it
   for lattices) is just the right tool for the task. Their bounds, belonging to a class of
   bounds known as linear programming bounds, give a way of associating a numerical
   upper bound for the packing density with certain functions of a single (real) variable
   with nice properties. The problem then becomes that of optimizing the bound over
   the relevant family of functions in the hope of producing a sharp bound.
   Amazingly, the numerical calculations Cohn and Elkies performed for many different
   values of the dimension $d$, which gave numerical bounds that were in many
   cases better than those previously known, revealed that for $d = 2$, 8, and 24, their
   bounding technique seems to approach the value known (in the case $d = 2$) or believed
   at the time (in the cases $d = 8$ and 24) to equal the optimal packing density.
   They conjectured that in those dimensions, there exists a so-called “magic function,”
   a function in the class of bounding functions for which the associated upper bound
   for the optimal sphere packing density matches the known lower bound and hence
   serves as a certificate that solves the sphere packing problem in that dimension.
   We explain the Cohn–Elkies bounding technique and their magic function conjectures
   in Sections A.8–A.11 of the Appendix.

4. Upper bound, part II: Viazovska’s modular form construction. Cohn and Elkies’s
   work reduced the sphere packing problem, at least in dimensions 8 and 24, to the
   problem of constructing a magic function. Viazovska discovered just the right technique
   for constructing the function with the desired properties in dimension 8 (and
   her ideas proved also applicable to dimension 24 with minor modifications) by making
   an ingenious use of modular forms. Explaining the details of her construction is
   the main goal of this chapter.


To the reader who is completely unfamiliar with the topic of sphere packings and wishes 
to gain a full understanding of the proof of Theorem 6.1, a recommended path is to read 
Appendix A first and then proceed to reading the remainder of this chapter. Section A.7, 
which only deals with the explicit construction of the $E_8$ lattice, is not necessary to
understand any other parts of the proof and may be skipped on a first reading. 


6.3 Preparation: some remarks on Fourier eigenfunctions 


From here on, we assume that you are familiar with the material and notation of Appendix A.
The starting point for our proof is Theorem A.29, which, as explained in Section A.12,
provides a kind of roadmap for constructing an $E_8$ magic function, based on
constructing separately the Fourier-even and Fourier-odd components $\varphi_+(r)$ and $\varphi_-(r)$
associated with a hypothetical radial magic function; these functions will be constructed
with the goal of manufacturing $(\pm 1)$-Fourier eigenfunctions having the prescribed set
of zeros (of appropriate orders) at $\sqrt{2n}$, $n = 1, 2, \ldots$. Once these functions are constructed,
they can be combined into a single radial function having the two functions
as its Fourier-even and Fourier-odd components. The hope is that for the function thus
constructed, the necessary conditions of Theorem A.29 will also turn out to be sufficient.

Thus, forgetting about magic functions for the moment, our immediate goal is to 
construct radial Fourier eigenfunctions in 8 dimensions with the correct set of zeros. 
We will prove the following result. 


Theorem 6.2. There exist radial Schwartz functions $\varphi_+, \varphi_- : \mathbb{R}^8 \to \mathbb{R}$ with the following
properties.
1. $\varphi_+(x)$ is a $(+1)$-Fourier eigenfunction, that is,

\[
\mathcal{F}_8[\varphi_+] = \varphi_+,
\]

where $\mathcal{F}_8$ denotes the Fourier transform in 8 dimensions (see the definition in (A.5)).
2. $\varphi_-(x)$ is a $(-1)$-Fourier eigenfunction, that is,

\[
\mathcal{F}_8[\varphi_-] = -\varphi_-.
\]

3. Each of the radial profiles $\varphi_+(r)$, $\varphi_-(r)$ has zeros at $r = \sqrt{2n}$, $n = 1, 2, 3, \ldots$, with the
zero at $\sqrt{2}$ being simple and the other zeros being of order 2.


Where do we begin to look for such functions? Well, probably the most famous example
of such an eigenfunction is the Gaussian function

\[
\gamma(x) = e^{-\pi\|x\|^2},
\]

for which it follows trivially from the analogous property of the one-dimensional Gaussian that

\[
\mathcal{F}_8(\gamma)(y) = \gamma(y).
\]

This will be useful to us in the following way: if we let

\[
\gamma_s(x) = e^{-\pi s\|x\|^2}
\]

denote a rescaled Gaussian, then because of the scaling behavior of the Fourier transform,
we have

\[
\mathcal{F}_8[\gamma_s](y) = \frac{1}{s^4}\,\gamma_{1/s}(y). \tag{6.1}
\]

This identity is valid not just for a real positive scaling parameter s, but in fact for any 
s in the half-plane Re(s) > 0, since in that case, y,(x) has good decay and integrability 
properties. 

Thus we see that the rescaled Gaussian $\gamma_s$ is not a Fourier eigenfunction if $s \neq 1$, but
a linear combination of $\gamma_s$ and $\gamma_{1/s}$ of the form $a\gamma_s + b\gamma_{1/s}$ with $a$, $b$ satisfying

\[
a = \pm s^4 b \tag{6.2}
\]

is an eigenfunction (associated with eigenvalue $\pm 1$ according to the choice of sign
in (6.2)). More generally, we can take sums of such linear combinations involving different
values of $s$, or even integrals with respect to $s$ of the form

\[
f(x) = \int w(s)\gamma_s(x)\,ds = \int w(s)e^{-\pi s\|x\|^2}\,ds, \tag{6.3}
\]


where w(s) is some weight function, and where the integration is taken over some range 
of values of s in the half-plane Re(s) > 0. Under appropriate assumptions over how w(s) 
relates to w(1/s), the resulting function will be a Fourier eigenfunction. This gives a rich 
source of potential eigenfunctions to use for our construction. 
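
To see where condition (6.2) comes from, one can apply (6.1) twice, once with scaling parameter $s$ and once with $1/s$. The short computation below is a sketch under the assumption that (6.1) is available for both parameters, as it is whenever $\operatorname{Re}(s) > 0$.

```latex
\[
\mathcal{F}_8\bigl[a\gamma_s + b\gamma_{1/s}\bigr]
  = a\,s^{-4}\,\gamma_{1/s} + b\,s^{4}\,\gamma_s .
\]
% If a = \pm s^4 b, then a s^{-4} = \pm b and b s^4 = \pm a, so
\[
\mathcal{F}_8\bigl[a\gamma_s + b\gamma_{1/s}\bigr]
  = \pm\bigl(a\gamma_s + b\gamma_{1/s}\bigr).
\]
```

The same bookkeeping, applied under the integral sign after the change of variables $s \mapsto 1/s$, is what produces a symmetry condition relating $w(s)$ and $w(1/s)$.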

It seems most natural to choose the interval $(0, \infty)$ as the range for the integration
in (6.3); the integral (6.3) can then be thought of simply as the Laplace transform

\[
\int_0^{\infty} w(s)e^{-\pi s z}\,ds \tag{6.4}
\]

in the variable $z = \|x\|^2$. In that case the weight function will need to satisfy $w(1/s) =
\pm s^{-2}w(s)$, a condition reminiscent of one of the defining equations for a modular form.
However, this is too naive of an idea and does not work, as it does not lead to a viable path
to choosing the weight function w(s) in a way that causes the function f(x) to have zeros 
at the desired radii. It turns out that a more clever choice is required that also incorpo- 
rates certain nonreal values of the scaling parameter s (see equations (6.31) and (6.54)). 
Modular forms still enter the picture, but they do so in a much more subtle and surpris- 
ing way. The details are given in the next two sections. 


6.4 The (+1)-Fourier eigenfunction 


In this section, we complete half of the proof of Theorem 6.2 by constructing the function
$\varphi_+(x)$ and establishing its properties. The construction for $\varphi_-(x)$ is given in the next
section. Both the functions $\varphi_+(x)$ and $\varphi_-(x)$ are constructed by taking the Laplace transform
of two functions $U : \mathbb{H} \to \mathbb{C}$ and $V : \mathbb{H} \to \mathbb{C}$, which are given explicitly in terms
of modular forms.

Let $\tau$ be a complex variable taking values in the upper half-plane, and let $q = e^{2\pi i\tau}$
as in Chapter 5. We will use the normalized versions $E_4$ and $E_6$ of the Eisenstein series
$G_4$ and $G_6$ defined in (5.87)–(5.88). With these definitions, it is useful to observe that

\[
E_4(\tau)^3 - E_6(\tau)^2 = \frac{1728}{(2\pi)^{12}}\,\Delta(\tau), \tag{6.5}
\]

a scalar multiple of the modular discriminant (see (4.15), (4.35)).
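Now define the function $U(\tau)$ by

\[
U(\tau) = 108\,\frac{\bigl(\tau E_4'(\tau) + 4E_4(\tau)\bigr)^2}{E_4(\tau)^3 - E_6(\tau)^2}. \tag{6.6}
\]

This can be expanded in the form

\[
U(\tau) = 108\left(\frac{E_4'(\tau)^2}{E_4(\tau)^3 - E_6(\tau)^2}\right)\tau^2
 + 864\left(\frac{E_4(\tau)E_4'(\tau)}{E_4(\tau)^3 - E_6(\tau)^2}\right)\tau
 + 1728\left(\frac{E_4(\tau)^2}{E_4(\tau)^3 - E_6(\tau)^2}\right), \tag{6.7}
\]

which will be convenient for certain calculations and highlights the structure of $U(\tau)$ as
a kind of “polynomial” in $\tau$ whose “coefficients” are themselves holomorphic functions
in $\tau$ that have useful modular properties and in particular are 1-periodic.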


Lemma 6.3. The function $U(\tau)$ takes real, nonnegative values on the positive imaginary
axis.


Proof. Referring to (5.87)–(5.88), it is evident that $E_4(\tau)$ and $E_6(\tau)$ take real values on
the positive imaginary axis and that $E_4'(\tau)$ takes imaginary values there. Therefore (6.6)
implies that $U(\tau)$ is real for $\tau = it$, $t > 0$. Moreover, in the fraction in (6.6), the numerator
is the square of a real number (hence nonnegative) for $\tau = it$, and the denominator
is a positive scalar multiple of $\Delta(it)$, which is itself a positive real number, as can be
seen, e.g., from the infinite product representation (5.65). Combining these observations
shows that $U(it) \ge 0$ for $t > 0$.


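Lemma 6.4. $U(\tau)$ satisfies the transformation properties

\[
U\!\left(-\frac{1}{\tau}\right) = \frac{1}{2\tau^2}\bigl(U(\tau + 1) - 2U(\tau) + U(\tau - 1)\bigr), \tag{6.8}
\]
\[
U\!\left(\frac{\tau - 1}{\tau}\right) = \frac{1}{\tau^2}\,U(\tau - 1), \tag{6.9}
\]
\[
U\!\left(-\frac{\tau + 1}{\tau}\right) = \frac{1}{\tau^2}\,U(\tau + 1). \tag{6.10}
\]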

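Proof. Start by noting that

\[
E_4'(-1/\tau) = \tau^2\frac{d}{d\tau}\bigl(E_4(-1/\tau)\bigr) = \tau^2\frac{d}{d\tau}\bigl(\tau^4E_4(\tau)\bigr)
= \tau^2\bigl(\tau^4E_4'(\tau) + 4\tau^3E_4(\tau)\bigr) = \tau^5\bigl(\tau E_4'(\tau) + 4E_4(\tau)\bigr).
\]

It follows that

\begin{align*}
U(-1/\tau) &= 108\,\frac{\bigl((-1/\tau)E_4'(-1/\tau) + 4E_4(-1/\tau)\bigr)^2}{E_4(-1/\tau)^3 - E_6(-1/\tau)^2} \\
&= 108\,\frac{\bigl((-1/\tau)\tau^5(\tau E_4'(\tau) + 4E_4(\tau)) + 4\tau^4E_4(\tau)\bigr)^2}{\tau^{12}\bigl(E_4(\tau)^3 - E_6(\tau)^2\bigr)} \\
&= 108\,\frac{1}{\tau^2}\,\frac{E_4'(\tau)^2}{E_4(\tau)^3 - E_6(\tau)^2}. \tag{6.11}
\end{align*}

On the other hand, by (6.7) and the comment above about the parenthesized expressions
in that representation being 1-periodic, the discrete second difference $U(\tau + 1) - 2U(\tau) +
U(\tau - 1)$ on the right-hand side of (6.8) is easily seen to be

\begin{align*}
&108\left(\frac{(E_4')^2}{E_4^3 - E_6^2}\right)\bigl((\tau + 1)^2 - 2\tau^2 + (\tau - 1)^2\bigr)
 + 864\left(\frac{E_4E_4'}{E_4^3 - E_6^2}\right)\bigl((\tau + 1) - 2\tau + (\tau - 1)\bigr) \\
&\qquad + 1728\left(\frac{E_4^2}{E_4^3 - E_6^2}\right)(1 - 2 + 1)
 = 2 \cdot 108\left(\frac{(E_4')^2}{E_4^3 - E_6^2}\right) + 0 + 0
 = 2 \cdot 108\,\frac{(E_4')^2}{E_4^3 - E_6^2}.
\end{align*}

This last expression by (6.11) is equal to $2\tau^2U(-1/\tau)$. This proves (6.8).

Next, if we denote $\widetilde{U}(\tau) = \tau^2U(-1/\tau)$, then by (6.11), $\widetilde{U}(\tau)$ is also 1-periodic. Using this
fact, together with the equivalent relation $U(\tau) = \tau^2\widetilde{U}(-1/\tau)$, we can write

\[
\tau^2\,U\!\left(\frac{\tau - 1}{\tau}\right)
= (\tau - 1)^2\,\widetilde{U}\!\left(-\frac{\tau}{\tau - 1}\right)
= (\tau - 1)^2\,\widetilde{U}\!\left(-\frac{1}{\tau - 1} - 1\right)
= (\tau - 1)^2\,\widetilde{U}\!\left(-\frac{1}{\tau - 1}\right)
= U(\tau - 1),
\]

which proves (6.9). Finally, (6.10) is obtained by substituting $-1/\tau$ in place of $\tau$ in (6.9).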
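
The three transformation properties of Lemma 6.4 are easy to test numerically, which is a worthwhile safeguard given how many signs and powers of $\tau$ are involved. The sketch below evaluates $U(\tau)$ from (6.6) using truncated $q$-series. It assumes the product expansion $E_4^3 - E_6^2 = 1728\,q\prod_{n \ge 1}(1 - q^n)^{24}$, which is what (6.5) gives when combined with the usual product formula for $\Delta$; the truncation order and the sample point are arbitrary choices.

```python
import cmath

PI = cmath.pi
TERMS = 80  # truncation order of the q-series; ample unless Im(tau) is very small

def sigma3(n):
    """Sum of the cubes of the divisors of n."""
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

SIGMA3 = [sigma3(n) for n in range(1, TERMS + 1)]

def U(tau):
    """U(tau) as in (6.6), computed from truncated q-series."""
    q = cmath.exp(2j * PI * tau)
    powers, p = [], 1.0
    for _ in range(TERMS):
        p *= q
        powers.append(p)
    e4 = 1 + 240 * sum(s * p for s, p in zip(SIGMA3, powers))
    # d/dtau of q**n equals 2*pi*i*n*q**n
    de4 = 2j * PI * 240 * sum(n * s * p for n, (s, p) in enumerate(zip(SIGMA3, powers), start=1))
    # Assumed product form of E4^3 - E6^2 (avoids numerical cancellation)
    disc = 1728 * q
    for p in powers:
        disc *= (1 - p) ** 24
    return 108 * (tau * de4 + 4 * e4) ** 2 / disc

tau = 0.3 + 1.1j  # arbitrary sample point in the upper half-plane
err_68 = abs(U(-1 / tau) - (U(tau + 1) - 2 * U(tau) + U(tau - 1)) / (2 * tau ** 2))
err_69 = abs(U((tau - 1) / tau) - U(tau - 1) / tau ** 2)
err_610 = abs(U(-(tau + 1) / tau) - U(tau + 1) / tau ** 2)
print(err_68, err_69, err_610)  # all three should be tiny relative to |U|
```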


Lemma 6.5. On the positive imaginary axis near T = ico and T = 0, U(t) has the asymp- 
totic behavior 


U(it) = e™ — 2407t +504+0(t?e°") (t > oo), (6.12) 
U(it) = O(t?e 7") (t > 0). (6.13) 
Proof. Using (5.87)–(5.88), the initial terms of the Taylor expansions (in powers of the variable $q$) of each of the parenthesized expressions in (6.7) can be readily obtained, giving the asymptotic relations, as $\tau\to i\infty$ and $q\to0$,

$$\frac{E_4'(\tau)^2}{E_4(\tau)^3 - E_6(\tau)^2} = -\frac{400\pi^2}{3}\,q + O(q^2), \tag{6.14}$$
$$\frac{E_4(\tau)E_4'(\tau)}{E_4(\tau)^3 - E_6(\tau)^2} = \frac{5\pi i}{18} + O(q), \tag{6.15}$$
$$\frac{E_4(\tau)^2}{E_4(\tau)^3 - E_6(\tau)^2} = \frac{1}{1728\,q} + \frac{7}{24} + O(q). \tag{6.16}$$

Substituting these relations into (6.7) gives (6.12). To get (6.13), use (6.11) together with (6.14) to get that, as $t\to0$,

$$U(it) = 108\,(-it)^2\,\frac{E_4'(i/t)^2}{E_4(i/t)^3 - E_6(i/t)^2} = -108\,t^2\left(-\frac{400\pi^2}{3}\,e^{-2\pi/t} + O(e^{-4\pi/t})\right) = O(t^2e^{-2\pi/t}),$$

which proves the claim.
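For readers who want the substitution step behind (6.12) spelled out, here is the calculation written in full. It assumes that (6.7) has the form $U = 108(\cdots)\tau^2 + 864(\cdots)\tau + 1728(\cdots)$, where the coefficients $864 = 108\cdot8$ and $1728 = 108\cdot16$ are inferred from (6.58) rather than quoted from (6.7) itself.

```latex
\begin{aligned}
U(it) &= 108\Bigl(-\tfrac{400\pi^2}{3}\,q + O(q^2)\Bigr)(it)^2
       + 864\Bigl(\tfrac{5\pi i}{18} + O(q)\Bigr)(it)
       + 1728\Bigl(\tfrac{1}{1728\,q} + \tfrac{7}{24} + O(q)\Bigr) \\
      &= 14400\pi^2 t^2 q \;-\; 240\pi t \;+\; \tfrac{1}{q} \;+\; 504 \;+\; O(t^2 q^2) + O(tq) + O(q) \\
      &= e^{2\pi t} - 240\pi t + 504 + O(t^2 e^{-2\pi t}),
      \qquad q = e^{-2\pi t}.
\end{aligned}
```

The term $14400\pi^2t^2q$ is what forces the factor $t^2$ in the error term of (6.12).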


Now define the holomorphic function

$$A(z) = 4i\sin^2\left(\frac{\pi z}{2}\right)\int_0^{i\infty}U(\tau)e^{\pi iz\tau}\,d\tau, \tag{6.17}$$
where the integral is a contour integral along the positive imaginary line. The motivation for this definition is that $\varphi_+(x)$ will later be constructed by substituting $\|x\|^2$ for $z$ (see (6.30)). This gives a variant of the Laplace transform-based construction (6.4), but with the additional factor of $\sin^2(\pi\|x\|^2/2)$ introduced to force the function to have zeros at the correct points $\|x\| = \sqrt2, \sqrt4, \sqrt6, \ldots$. (The sine factor is squared since we want all but one of the zeros to be double zeros; recall Theorem A.29.) Some analysis is now required to verify that the idea can lead to a Fourier eigenfunction or indeed that $\varphi_+(x)$ thus defined is even a legitimate function on $\mathbb{R}^8$. We focus on the properties of $A(z)$ as a holomorphic function first before turning to a discussion of $\varphi_+(x)$ but keep the substitution $z = \|x\|^2$ in mind as you read the next few results.


Lemma 6.6. The integral in (6.17) converges in the half-plane Re(z) > 2 and defines a 
holomorphic function there. 


Proof. By Lemma 6.5 the integrand in (6.17) (with the parameterization $\tau = it$) satisfies the asymptotic bounds

$$|U(it)e^{-\pi zt}| = O\bigl(e^{-\pi(\operatorname{Re}(z)-2)t}\bigr) \quad (t\to\infty),$$
$$|U(it)e^{-\pi zt}| = O\bigl(t^2e^{-\pi\operatorname{Re}(z)t - 2\pi/t}\bigr) \quad (t\to0).$$

The constant implicit in the big-O notation does not depend on $z$. Thus, if we write the integral in (6.17) as $I_1(z) + I_2(z)$, where $I_1(z) = \int_0^{i}U(\tau)e^{\pi iz\tau}\,d\tau$ and $I_2(z) = \int_i^{i\infty}U(\tau)e^{\pi iz\tau}\,d\tau$, then, by the standard complex analysis lemma on integrals of a family of holomorphic functions with respect to a parameter (Exercise 1.26 on p. 77), the improper integral $I_1(z)$ converges in the half-plane Re(z) > −2 and defines a holomorphic function there. Similarly, $I_2(z)$ converges and is holomorphic in the half-plane Re(z) > 2.


Next, we show that A(z) can be continued analytically to the half-plane Re(z) > 0, 
and a bit later, we will show that it can be continued analytically even beyond that half- 
plane. As per the usual convention in complex analysis, we continue to use the same 
notation A(z) to denote all analytic continuations of A(z). 

The formula for the first analytic continuation involves integration over four paths, which we denote by $\Psi_{-1}$, $\Psi_0$, $\Psi_1$, and $\Psi_{i\infty}$, collectively forming the shape of an inverted pitchfork (or an inverted Greek letter $\Psi$), as shown in Fig. 6.4. These paths are defined as follows:

- $\Psi_{-1}$ is the circular subarc of the unit circle leading from $-1$ to $i$;
- $\Psi_1$ is the circular subarc of the unit circle leading from $+1$ to $i$;
- $\Psi_0$ is the straight line segment from $0$ to $i$;
- $\Psi_{i\infty}$ is the infinite straight line segment from $i$ to $i\infty$.

Figure 6.4: The “pitchfork paths” $\Psi_{-1}$, $\Psi_0$, $\Psi_1$, $\Psi_{i\infty}$.


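Lemma 6.7. The function $A(z)$ has the alternative expression

$$\begin{aligned}
A(z) = {}& -i\int_{\Psi_{-1}}U(\tau+1)e^{\pi iz\tau}\,d\tau - i\int_{\Psi_1}U(\tau-1)e^{\pi iz\tau}\,d\tau \\
& + 2i\int_{\Psi_0}U(\tau)e^{\pi iz\tau}\,d\tau - 2i\int_{\Psi_{i\infty}}\tau^2U(-1/\tau)e^{\pi iz\tau}\,d\tau.
\end{aligned}\tag{6.18}$$

Expression (6.18) extends the definition of $A(z)$ to a holomorphic function on the half-plane Re(z) > 0.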


Proof. Denote the right-hand side of (6.18) by $\tilde A(z)$, and rewrite this function as
$$\tilde A(z) = -i\bigl(A_{-1}(z) + A_1(z) - 2A_0(z) + 2A_{i\infty}(z)\bigr),$$
where we set

$$A_{-1}(z) = \int_{\Psi_{-1}}U(\tau+1)e^{\pi iz\tau}\,d\tau, \tag{6.19}$$
$$A_1(z) = \int_{\Psi_1}U(\tau-1)e^{\pi iz\tau}\,d\tau, \tag{6.20}$$
$$A_0(z) = \int_{\Psi_0}U(\tau)e^{\pi iz\tau}\,d\tau, \tag{6.21}$$
$$A_{i\infty}(z) = \int_{\Psi_{i\infty}}\tau^2U(-1/\tau)e^{\pi iz\tau}\,d\tau. \tag{6.22}$$

Now $A_0(z)$ is the same as the integral $I_1(z)$ from the proof of Lemma 6.6. It was established in that proof that this integral converges to a holomorphic function in the region Re(z) > −2. The convergence of $A_{i\infty}(z)$ to a holomorphic function, also in the region Re(z) > −2, follows in a similar manner using (6.11) and (6.14).

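Next, to verify the convergence of the integral $A_{-1}(z)$, we first rewrite it by applying a change of variables $\xi = -1/(\tau+1)$. It is easy to check that this maps the contour $\Psi_{-1}$ into the reverse of the straight line segment $[-\frac12+\frac12i, -\frac12+i\infty)$, so we get the expression

$$A_{-1}(z) = -\int_{-\frac12+\frac12i}^{-\frac12+i\infty}U(-1/\xi)\,e^{-\pi iz(\xi^{-1}+1)}\,\xi^{-2}\,d\xi.$$

Denoting $\xi = -\frac12+it$, where $t\ge1/2$, we have the bounds
$$|U(-1/\xi)| = O(e^{-2\pi t})$$
(refer again to (6.11) and (6.14)) and, under the assumption that Re(z) > 0,

$$\begin{aligned}
\bigl|e^{-\pi iz(\xi^{-1}+1)}\bigr| &= \exp\bigl[\pi\bigl(\operatorname{Re}(z)\operatorname{Im}(\xi^{-1}+1) + \operatorname{Im}(z)\operatorname{Re}(\xi^{-1}+1)\bigr)\bigr] \\
&= \exp\left[-\pi\operatorname{Re}(z)\frac{t}{t^2+1/4} + \pi\operatorname{Im}(z)\frac{t^2-1/4}{t^2+1/4}\right] \le \exp\bigl(\pi|\operatorname{Im}(z)|\bigr).
\end{aligned}$$

Therefore we conclude that given a compact set $K\subset\{\operatorname{Re}(z)>0\}$, there is a constant $C>0$ such that for all $z\in K$, we have

$$\int_{\Psi_{-1}}\bigl|U(\tau+1)e^{\pi iz\tau}\bigr|\,|d\tau| = \int_{-\frac12+\frac12i}^{-\frac12+i\infty}|U(-1/\xi)|\cdot\bigl|e^{-\pi iz(\xi^{-1}+1)}\bigr|\cdot|\xi|^{-2}\,|d\xi| \le C\int_{1/2}^{\infty}e^{-2\pi t}\,dt = \frac{C}{2\pi}\,e^{-\pi}.$$

This implies the convergence of the integral to a holomorphic function in the half-plane Re(z) > 0 by the result of Exercise 1.26.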

The convergence of $A_1(z)$ is proved similarly to the case of $A_{-1}(z)$ by making the substitution $\xi = -1/(\tau-1)$; details are left to the reader.
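Having established that $\tilde A(z)$ is well-defined and holomorphic on Re(z) > 0, it remains to check that it extends the definition of $A(z)$. Assume that Re(z) > 2. First, rewrite definition (6.17) of $A(z)$ as

$$\begin{aligned}
A(z) &= -i\bigl(e^{\pi iz/2} - e^{-\pi iz/2}\bigr)^2\int_0^{i\infty}U(\tau)e^{\pi iz\tau}\,d\tau \\
&= -i\bigl(e^{\pi iz} - 2 + e^{-\pi iz}\bigr)\int_0^{i\infty}U(\tau)e^{\pi iz\tau}\,d\tau \\
&= -i\left(\int_0^{i\infty}U(\tau)e^{\pi iz(\tau+1)}\,d\tau - 2\int_0^{i\infty}U(\tau)e^{\pi iz\tau}\,d\tau + \int_0^{i\infty}U(\tau)e^{\pi iz(\tau-1)}\,d\tau\right) \\
&= -i\left(\int_1^{1+i\infty}U(\rho-1)e^{\pi iz\rho}\,d\rho + \int_{-1}^{-1+i\infty}U(\xi+1)e^{\pi iz\xi}\,d\xi
 - 2\int_{\Psi_0}U(\tau)e^{\pi iz\tau}\,d\tau - 2\int_{\Psi_{i\infty}}U(\tau)e^{\pi iz\tau}\,d\tau\right),
\end{aligned}\tag{6.23}$$

where in the last step, we make the substitutions $\rho = \tau+1$, $\xi = \tau-1$, and for the middle integral, decompose the integral over the segment $[0, i\infty)$ into two integrals over $\Psi_0$ and $\Psi_{i\infty}$.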
Next, observe that in (6.23), we can transform the integrals over the segments $[-1, -1+i\infty)$ and $[1, 1+i\infty)$ by deforming the contours: specifically, the segment $[-1, -1+i\infty)$ can be deformed into $\Psi_{-1} + \Psi_{i\infty}$, and the segment $[1, 1+i\infty)$ can be deformed into $\Psi_1 + \Psi_{i\infty}$. Because of the exponential decay of the integrand as $\operatorname{Im}(\tau)\to\infty$ (a fact which follows from the assumption that Re(z) > 2, expression (6.7), and the asymptotic estimates (6.14)–(6.16)), an application of Cauchy’s theorem together with an easy limiting argument shows that this deformation leaves the values of the respective integrals unchanged. The first transformed integral can therefore be rewritten as

$$\int_1^{1+i\infty}U(\rho-1)e^{\pi iz\rho}\,d\rho = \int_{\Psi_1}U(\rho-1)e^{\pi iz\rho}\,d\rho + \int_{\Psi_{i\infty}}U(\rho-1)e^{\pi iz\rho}\,d\rho,$$

and similarly the second transformed integral becomes

$$\int_{-1}^{-1+i\infty}U(\xi+1)e^{\pi iz\xi}\,d\xi = \int_{\Psi_{-1}}U(\xi+1)e^{\pi iz\xi}\,d\xi + \int_{\Psi_{i\infty}}U(\xi+1)e^{\pi iz\xi}\,d\xi.$$

Substituting these expressions into (6.23), collecting terms, and then making use of (6.8) give
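$$\begin{aligned}
A(z) &= -i\left(\int_{\Psi_1}U(\tau-1)e^{\pi iz\tau}\,d\tau + \int_{\Psi_{-1}}U(\tau+1)e^{\pi iz\tau}\,d\tau\right. \\
&\qquad\quad \left. + \int_{\Psi_{i\infty}}\bigl(U(\tau+1) - 2U(\tau) + U(\tau-1)\bigr)e^{\pi iz\tau}\,d\tau - 2\int_{\Psi_0}U(\tau)e^{\pi iz\tau}\,d\tau\right) \\
&= -i\left(\int_{\Psi_1}U(\tau-1)e^{\pi iz\tau}\,d\tau + \int_{\Psi_{-1}}U(\tau+1)e^{\pi iz\tau}\,d\tau\right. \\
&\qquad\quad \left. + 2\int_{\Psi_{i\infty}}\tau^2U(-1/\tau)e^{\pi iz\tau}\,d\tau - 2\int_{\Psi_0}U(\tau)e^{\pi iz\tau}\,d\tau\right) \\
&= -i\bigl(A_1(z) + A_{-1}(z) + 2A_{i\infty}(z) - 2A_0(z)\bigr) = \tilde A(z),
\end{aligned}$$

as claimed.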


Next, it is useful to derive yet another representation for A(z), which continues it 
analytically to an even larger half-plane. 


Lemma 6.8. The function $A(z)$ is also given by the alternative expression

$$A(z) = -4\sin^2\left(\frac{\pi z}{2}\right)\left[\frac1\pi\left(\frac{1}{z-2} - \frac{240}{z^2} + \frac{504}{z}\right) + \int_0^\infty\bigl(U(it) - e^{2\pi t} + 240\pi t - 504\bigr)e^{-\pi zt}\,dt\right]. \tag{6.24}$$

The right-hand side of (6.24) defines a holomorphic function on the half-plane Re(z) > −2 (after interpreting its values at the points z = 0 and z = 2 in a suitable limiting sense to account for removable singularities at those points) and therefore gives an analytic continuation of $A(z)$ to that half-plane.


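Proof. Assume first that Re(z) > 2. Motivated by (6.12), we write

$$\begin{aligned}
A(z) &= -4\sin^2\left(\frac{\pi z}{2}\right)\int_0^\infty U(it)e^{-\pi zt}\,dt \\
&= -4\sin^2\left(\frac{\pi z}{2}\right)\int_0^\infty\Bigl[\bigl(e^{2\pi t} - 240\pi t + 504\bigr) + \bigl(U(it) - e^{2\pi t} + 240\pi t - 504\bigr)\Bigr]e^{-\pi zt}\,dt \\
&= -4\sin^2\left(\frac{\pi z}{2}\right)\left[\int_0^\infty\bigl(e^{2\pi t} - 240\pi t + 504\bigr)e^{-\pi zt}\,dt + \int_0^\infty\bigl(U(it) - e^{2\pi t} + 240\pi t - 504\bigr)e^{-\pi zt}\,dt\right].
\end{aligned}$$

Evaluating the first of the two integrals in the last expression, we obtain the representation

$$A(z) = -4\sin^2\left(\frac{\pi z}{2}\right)\left[\frac{1}{\pi(z-2)} - \frac{240}{\pi z^2} + \frac{504}{\pi z} + \int_0^\infty\bigl(U(it) - e^{2\pi t} + 240\pi t - 504\bigr)e^{-\pi zt}\,dt\right]. \tag{6.25}$$

Finally, observe that by (6.12) and the usual appeal to the integration lemma from Exercise 1.26, the integral in (6.25) converges to a holomorphic function in the half-plane Re(z) > −2.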


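Lemma 6.9. The function $A(z)$ has the special value
$$A(0) = 240\pi. \tag{6.26}$$

Proof. $A(0)$ is the value of $A(z)$ at the removable singularity z = 0 of the expression in (6.24). It is easily calculated as

$$\begin{aligned}
A(0) &= \lim_{z\to0}\left(-\frac4\pi\sin^2\left(\frac{\pi z}{2}\right)\left(\frac{1}{z-2} - \frac{240}{z^2} + \frac{504}{z}\right)\right) \\
&= \lim_{z\to0}\left(\frac4\pi\cdot\frac{240}{z^2}\sin^2\left(\frac{\pi z}{2}\right)\right) = 240\pi\lim_{z\to0}\left(\frac{\sin^2(\pi z/2)}{(\pi z/2)^2}\right) = 240\pi.
\end{aligned}$$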
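As a quick sanity check on the sign conventions in (6.24), the limit can be approximated numerically. This is only an illustrative sketch: the function name `rational_part` is hypothetical, and it evaluates just the non-integral part of (6.24), since the integral term is multiplied by $\sin^2(\pi z/2)$ and so contributes nothing in the limit.

```python
import math

def rational_part(z):
    """-(4/pi) sin^2(pi z/2) * (1/(z-2) - 240/z^2 + 504/z), cf. (6.24)."""
    s2 = math.sin(math.pi * z / 2) ** 2
    return -(4 / math.pi) * s2 * (1 / (z - 2) - 240 / z**2 + 504 / z)

target = 240 * math.pi
for z in (1e-2, 1e-4, 1e-6):
    # the gap to 240*pi shrinks roughly linearly in z
    print(z, rational_part(z), target)
```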


The next two lemmas establish some useful technical bounds. 


Lemma 6.10. We have the bound 


(oe) 
| et bit dt < ala (6.27) 
0 


for all $a, b > 0$.


Proof. Exercise 6.1. 


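Lemma 6.11. For any $k\ge0$, there exist constants $C_1, C_2 > 0$ such that the $k$th derivative $A^{(k)}(z)$ of $A(z)$ satisfies the bound

$$|A^{(k)}(z)| \le C_1e^{-C_2\sqrt{\operatorname{Re}(z)}} \quad (\operatorname{Re}(z) > 3). \tag{6.28}$$

Proof. Denote $a(z) = \int_0^{i\infty}U(\tau)e^{\pi iz\tau}\,d\tau$. Then, for $z$ with Re(z) > 3, we have

$$a^{(k)}(z) = i(-\pi)^k\int_0^\infty t^kU(it)e^{-\pi zt}\,dt = i(-\pi)^k\left(\int_0^1 + \int_1^\infty\right)t^kU(it)e^{-\pi zt}\,dt.$$

Using Lemma 6.5, we see that there exists a constant $C > 0$ such that

$$\begin{aligned}
|a^{(k)}(z)| &\le C\pi^k\left(\int_0^1t^{k+2}e^{-\pi t\operatorname{Re}(z) - 2\pi/t}\,dt + \int_1^\infty t^ke^{-\pi(\operatorname{Re}(z)-2)t}\,dt\right) \\
&\le C\pi^k\left(\int_0^\infty e^{-\pi t\operatorname{Re}(z) - 2\pi/t}\,dt + e^{-\pi(\operatorname{Re}(z)-2)}\int_1^\infty t^ke^{-\pi(\operatorname{Re}(z)-2)(t-1)}\,dt\right).
\end{aligned}$$

In the last expression, by (6.27) the first integral is bounded from above by $c_1e^{-c_2\sqrt{\operatorname{Re}(z)}}$ for some constants $c_1, c_2 > 0$. It is similarly easy to check that the second integral (including the leading multiplicative factor $e^{-\pi(\operatorname{Re}(z)-2)}$) is bounded by $c_3e^{-\pi\operatorname{Re}(z)}$ for some constant $c_3 > 0$. Combining these two bounds, we get a bound of the form

$$|a^{(k)}(z)| \le c_4e^{-c_5\sqrt{\operatorname{Re}(z)}} \quad (\operatorname{Re}(z) > 3) \tag{6.29}$$

with constants $c_4, c_5 > 0$ (possibly depending on $k$).
Finally, note that

$$\left|\frac{d^k}{dz^k}\left(\sin^2\left(\frac{\pi z}{2}\right)a(z)\right)\right| = \left|\sum_{j=0}^k\binom{k}{j}a^{(j)}(z)\cdot\frac{d^{k-j}}{dz^{k-j}}\left(\sin^2\left(\frac{\pi z}{2}\right)\right)\right| \le \sum_{j=0}^k\binom{k}{j}\bigl|a^{(j)}(z)\bigr|\cdot\left|\frac{d^{k-j}}{dz^{k-j}}\left(\sin^2\left(\frac{\pi z}{2}\right)\right)\right|,$$

so the bound (6.29) (or more precisely, the family of bounds indexed by $k\ge0$) also easily implies a bound of the form (6.28) for $A^{(k)}(z)$ for any $k\ge0$ with constants $C_1, C_2$, which may depend on $k$.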


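We now use the function $A(z)$ to define a radial Fourier eigenfunction in $\mathbb{R}^8$. Define the radial function $\varphi_+ : \mathbb{R}^8\to\mathbb{C}$ by

$$\varphi_+(x) = A(\|x\|^2). \tag{6.30}$$

By Lemma 6.7, for $x\ne0$, this can be expressed explicitly as

$$\begin{aligned}
\varphi_+(x) = {}& -i\int_{\Psi_{-1}}U(\tau+1)e^{\pi i\tau\|x\|^2}\,d\tau - i\int_{\Psi_1}U(\tau-1)e^{\pi i\tau\|x\|^2}\,d\tau \\
& + 2i\int_{\Psi_0}U(\tau)e^{\pi i\tau\|x\|^2}\,d\tau - 2i\int_{\Psi_{i\infty}}\tau^2U(-1/\tau)e^{\pi i\tau\|x\|^2}\,d\tau.
\end{aligned}\tag{6.31}$$

This should be thought of as the “correct” version of (6.3), in which the weight function $w(s)$ and the range for the integration are explicitly revealed.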
Lemma 6.12. $\varphi_+(x)$ is a Schwartz function.


Proof. This follows from Lemma 6.11 and Lemma A.23. The details are left as an exercise (Exercise 6.2).


Lemma 6.13. $\varphi_+(x)$ is a (+1)-eigenfunction for the Fourier transform in $\mathbb{R}^8$.


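Proof. We evaluate the Fourier transform of $\varphi_+$ by commuting the transform operator $\mathcal{F}_8$ with the integrals in (6.31) and applying (6.1) (or rather, the generalized version of this relation that applies to complex $s$; see (A.12)–(A.13) in Section A.7) inside each integral. Let $y\in\mathbb{R}^8\setminus\{0\}$. Then

$$\begin{aligned}
\mathcal{F}_8[\varphi_+](y) = {}& -i\int_{\Psi_{-1}}U(\tau+1)\,\mathcal{F}_8\bigl[e^{\pi i\tau\|x\|^2}\bigr](y)\,d\tau - i\int_{\Psi_1}U(\tau-1)\,\mathcal{F}_8\bigl[e^{\pi i\tau\|x\|^2}\bigr](y)\,d\tau \\
& + 2i\int_{\Psi_0}U(\tau)\,\mathcal{F}_8\bigl[e^{\pi i\tau\|x\|^2}\bigr](y)\,d\tau - 2i\int_{\Psi_{i\infty}}\tau^2U(-1/\tau)\,\mathcal{F}_8\bigl[e^{\pi i\tau\|x\|^2}\bigr](y)\,d\tau \\
= {}& -i\int_{\Psi_{-1}}U(\tau+1)\,\tau^{-4}e^{\pi i(-1/\tau)\|y\|^2}\,d\tau - i\int_{\Psi_1}U(\tau-1)\,\tau^{-4}e^{\pi i(-1/\tau)\|y\|^2}\,d\tau \\
& + 2i\int_{\Psi_0}U(\tau)\,\tau^{-4}e^{\pi i(-1/\tau)\|y\|^2}\,d\tau - 2i\int_{\Psi_{i\infty}}\tau^2U(-1/\tau)\,\tau^{-4}e^{\pi i(-1/\tau)\|y\|^2}\,d\tau.
\end{aligned}\tag{6.32}$$

Now, in each of the four integrals in the last expression, make the change of variables $\rho = -1/\tau$. This change has the effect of permuting the four pitchfork paths $\Psi_{-1}$, $\Psi_1$, $\Psi_0$, $\Psi_{i\infty}$ according to

$$\Psi_{-1}\leftrightarrow\Psi_1, \qquad \Psi_0\leftrightarrow-\Psi_{i\infty} \tag{6.33}$$

(where $-\Psi_{i\infty}$ refers to $\Psi_{i\infty}$ with the reverse orientation). Thus the expression in (6.32) becomes

$$\begin{aligned}
& -i\int_{\Psi_1}U\left(-\frac1\rho+1\right)\rho^4e^{\pi i\rho\|y\|^2}\,\frac{d\rho}{\rho^2} - i\int_{\Psi_{-1}}U\left(-\frac1\rho-1\right)\rho^4e^{\pi i\rho\|y\|^2}\,\frac{d\rho}{\rho^2} \\
& - 2i\int_{\Psi_{i\infty}}U\left(-\frac1\rho\right)\rho^4e^{\pi i\rho\|y\|^2}\,\frac{d\rho}{\rho^2} + 2i\int_{\Psi_0}(-1/\rho)^2\,U(\rho)\,\rho^4e^{\pi i\rho\|y\|^2}\,\frac{d\rho}{\rho^2}.
\end{aligned}\tag{6.34}$$

By (6.9) and (6.10) this is equal to

$$\begin{aligned}
& -i\int_{\Psi_1}U(\rho-1)e^{\pi i\rho\|y\|^2}\,d\rho - i\int_{\Psi_{-1}}U(\rho+1)e^{\pi i\rho\|y\|^2}\,d\rho \\
& - 2i\int_{\Psi_{i\infty}}\rho^2U(-1/\rho)e^{\pi i\rho\|y\|^2}\,d\rho + 2i\int_{\Psi_0}U(\rho)e^{\pi i\rho\|y\|^2}\,d\rho = \varphi_+(y).
\end{aligned}$$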


We proved the equality $\mathcal{F}_8[\varphi_+](y) = \varphi_+(y)$ for all $y\in\mathbb{R}^8\setminus\{0\}$. By continuity the claim also holds for $y = 0$.


Lemma 6.14. The radial profile $\varphi_+(r)$ associated with $\varphi_+(x)$ has zeros at $r = \sqrt{2n}$, $n = 1, 2, 3, \ldots$. The zero at $r = \sqrt2$ is simple, and the zeros at $r = \sqrt{2n}$, $n\ge2$, are of order 2.


Proof. By (6.25), $A(z)$ has zeros at $z = 2n$, $n = 1, 2, 3, \ldots$, with the zero at $z = 2$ being simple and the zeros at $z = 2n$, $n\ge2$, being of order 2. Since $\varphi_+(r)$ is related to $A(z)$ via

$$\varphi_+(r) = A(r^2),$$

the result follows.


Suggested exercises for Section 6.4. 6.1, 6.2, 6.3, 6.4. 


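6.5 The (−1)-Fourier eigenfunction


Let $\theta_j(\tau)$, $j = 2, 3, 4$, be the Jacobi thetanull functions, discussed in Subsections 5.13.1 and 5.14.3. We define

$$V(\tau) = 128\left(\frac{\theta_3(\tau)^4 + \theta_4(\tau)^4}{\theta_2(\tau)^8} + \frac{\theta_4(\tau)^4 - \theta_2(\tau)^4}{\theta_3(\tau)^8}\right). \tag{6.35}$$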

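Lemma 6.15. The function $V(\tau)$ takes real, nonnegative values on the positive imaginary axis.


Proof. We can see from (5.54)–(5.56) and (6.35) that $V(\tau)$ is real on the positive imaginary axis. For the nonnegativity claim, it is helpful to use the connection of the theta functions $\theta_2, \theta_3, \theta_4$ to the modular lambda function $\lambda(\tau)$ (see Sections 5.13.2, 5.14.2, and 5.14.3). Using identities (5.81), we have that

$$\begin{aligned}
\frac{1}{128}V &= \frac{\theta_3^4 + \theta_4^4}{\theta_2^8} + \frac{\theta_4^4 - \theta_2^4}{\theta_3^8}
 = \frac{1}{\theta_3^4}\cdot\frac{\theta_3^8 + \theta_3^4\theta_4^4}{\theta_2^8} + \frac{1}{\theta_3^4}\cdot\frac{\theta_4^4 - \theta_2^4}{\theta_3^4} \\
&= \frac{1}{\theta_3^4}\left(\frac{1}{\lambda^2}\bigl(1 + (1-\lambda)\bigr) + (1-\lambda) - \lambda\right) = \frac{1}{\theta_3^4}\cdot\frac{(1-\lambda)(2 + \lambda + 2\lambda^2)}{\lambda^2}.
\end{aligned}$$

Now note that $\lambda(it)\in(0,1)$ for $t > 0$, as is apparent from either the second identity in (5.81) or from the infinite product representation (5.75). Since the function $x\mapsto\frac{(1-x)(2+x+2x^2)}{x^2}$ is positive for $x\in(0,1)$, and since clearly $\theta_3(it)^4 > 0$ for $t > 0$, from the definition we get that $V(it)$ is nonnegative (in fact, positive) for $t > 0$.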


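Lemma 6.16. $V(\tau)$ satisfies the transformation properties

$$V\left(-\frac1\tau\right) = \frac{1}{\tau^2}\bigl(V(\tau) - V(\tau+1)\bigr) = \frac{1}{\tau^2}\bigl(V(\tau) - V(\tau-1)\bigr), \tag{6.36}$$
$$V\left(-\frac1\tau\right) = -\frac{1}{2\tau^2}\bigl(V(\tau+1) - 2V(\tau) + V(\tau-1)\bigr), \tag{6.37}$$
$$V\left(-\frac1\tau+1\right) = -\frac{1}{\tau^2}\,V(\tau-1), \tag{6.38}$$
$$V\left(-\frac1\tau-1\right) = -\frac{1}{\tau^2}\,V(\tau+1). \tag{6.39}$$


Proof. From (6.35) and the transformation relations (5.57)–(5.59) satisfied by the functions $\theta_j(\tau)$ we get immediately that

$$V(\tau+1) = V(\tau-1) = 128\left(\frac{\theta_3(\tau)^4 + \theta_4(\tau)^4}{\theta_2(\tau)^8} + \frac{\theta_2(\tau)^4 + \theta_3(\tau)^4}{\theta_4(\tau)^8}\right), \tag{6.40}$$
$$\tau^2V(-1/\tau) = -128\left(\frac{\theta_2(\tau)^4 + \theta_3(\tau)^4}{\theta_4(\tau)^8} + \frac{\theta_2(\tau)^4 - \theta_4(\tau)^4}{\theta_3(\tau)^8}\right), \tag{6.41}$$

which, together with (6.35), gives (6.36). Relation (6.37) then follows trivially. We also obtain from (6.41) that $\tilde V(\tau) = \tau^2V(-1/\tau)$ satisfies $\tilde V(\tau+1) = -\tilde V(\tau)$. This in turn implies (6.38) and (6.39) in a manner analogous to the proof of (6.9) and (6.10) from (6.8) in the previous section.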


From now on, we adopt the notation $Q = e^{\pi i\tau} = q^{1/2}$ introduced in Subsection 5.14.2. As we can see from (5.54)–(5.56) and (5.78)–(5.80), the functions $\theta_2^4$, $\theta_3^4$, and $\theta_4^4$ in terms of which $V(\tau)$ is defined are all naturally expressed as power series in the variable $Q$, so this notation is helpful for asymptotic calculations.


Lemma 6.17. On the positive imaginary axis near $\tau = i\infty$ and $\tau = 0$, $V(\tau)$ has the asymptotic behavior

$$V(it) = e^{2\pi t} + 144 + O(e^{-\pi t}) \quad (t\to\infty), \tag{6.42}$$
$$V(it) = 10240\,t^2e^{-\pi/t} + O(t^2e^{-2\pi/t}) \quad (t\to0). \tag{6.43}$$


Proof. By writing out the series expansions for $\theta_j(\tau)$ in powers of $Q$ up to low order we find that, as $\tau\to i\infty$,

$$\theta_2(\tau)^4 = 16\bigl(Q + 4Q^3 + 6Q^5 + 8Q^7 + 13Q^9\bigr) + O(Q^{11}), \tag{6.44}$$
$$\theta_3(\tau)^4 = 1 + 8Q + 24Q^2 + 32Q^3 + 24Q^4 + 48Q^5 + 96Q^6 + O(Q^7), \tag{6.45}$$
$$\theta_4(\tau)^4 = 1 - 8Q + 24Q^2 - 32Q^3 + 24Q^4 - 48Q^5 + 96Q^6 + O(Q^7). \tag{6.46}$$

Upon substitution of these relations into (6.35), further mundane algebraic calculations give the expansion

$$V(\tau) = Q^{-2} + 144 - 5120\,Q + 70524\,Q^2 - 626688\,Q^3 + 4265600\,Q^4 + O(Q^5) \tag{6.47}$$

for $V(\tau)$. This gives (6.42). A similar calculation using (6.41) gives

$$\tau^2V(-1/\tau) = -2048\bigl(5Q + 612Q^3 + 23598Q^5\bigr) + O(Q^7), \tag{6.48}$$

easily implying (6.43) on setting $\tau = i/t$.
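The "mundane algebraic calculations" behind (6.47) are easy to delegate to a computer. The sketch below uses exact integer power-series arithmetic in $Q$; it assumes the standard series $\theta_3 = 1 + 2\sum_{n\ge1}Q^{n^2}$, $\theta_4 = 1 + 2\sum_{n\ge1}(-1)^nQ^{n^2}$ and $\theta_2 = 2Q^{1/4}\sum_{n\ge0}Q^{n(n+1)}$, and all helper names (`mul`, `power`, `inverse`, `series`) are illustrative. It computes the coefficients of $Q^2V(\tau)$, which is an honest power series.

```python
N = 8  # truncate power series in Q after degree N

def mul(a, b):
    c = [0] * (N + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > N:
                    break
                c[i + j] += ai * bj
    return c

def power(a, k):
    r = [1] + [0] * N
    for _ in range(k):
        r = mul(r, a)
    return r

def inverse(a):
    """Inverse of a power series with constant term 1 (stays integral)."""
    assert a[0] == 1
    inv = [1] + [0] * N
    for n in range(1, N + 1):
        inv[n] = -sum(a[k] * inv[n - k] for k in range(1, n + 1))
    return inv

def series(terms):
    s = [0] * (N + 1)
    for e, c in terms:
        if e <= N:
            s[e] += c
    return s

th3 = series([(0, 1)] + [(n * n, 2) for n in range(1, N + 1)])
th4 = series([(0, 1)] + [(n * n, 2 * (-1) ** n) for n in range(1, N + 1)])
P = series([(n * (n + 1), 1) for n in range(0, N + 1)])  # theta_2 = 2 Q^(1/4) P

th3_4, th4_4 = power(th3, 4), power(th4, 4)
half_sum = [(x + y) // 2 for x, y in zip(th3_4, th4_4)]   # (th3^4 + th4^4) / 2
first = mul(half_sum, inverse(power(P, 8)))               # Q^2 * 128 (th3^4 + th4^4) / th2^8
th2_4 = [0] + [16 * c for c in power(P, 4)][:N]           # theta_2^4 = 16 Q P^4
diff = [x - y for x, y in zip(th4_4, th2_4)]
second = mul(diff, inverse(power(th3, 8)))                # (th4^4 - th2^4) / th3^8
Q2V = [f + (128 * second[n - 2] if n >= 2 else 0) for n, f in enumerate(first)]
print(Q2V[:7])  # coefficients of Q^0..Q^6 in Q^2 V, to be compared with (6.47)
```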


Now by analogy with (6.17) define

$$B(z) = 4i\sin^2\left(\frac{\pi z}{2}\right)\int_0^{i\infty}V(\tau)e^{\pi iz\tau}\,d\tau \tag{6.49}$$

(a contour integral along the positive imaginary line).


Lemma 6.18. The integral in (6.49) converges absolutely and uniformly on compact sets and defines a holomorphic function in the half-plane Re(z) > 2.


Proof. This follows from (6.42)–(6.43) analogously to the proof of Lemma 6.6.


We now proceed to perform an analytic continuation of B(z) to the half-plane 
Re(z) > —1 in two steps that are analogous to Lemmas 6.7 and 6.8 from the previous 
section. 


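Lemma 6.19. The function $B(z)$ has the alternative expression

$$\begin{aligned}
B(z) = {}& -i\int_{\Psi_{-1}}V(\tau+1)e^{\pi iz\tau}\,d\tau - i\int_{\Psi_1}V(\tau-1)e^{\pi iz\tau}\,d\tau \\
& + 2i\int_{\Psi_0}V(\tau)e^{\pi iz\tau}\,d\tau + 2i\int_{\Psi_{i\infty}}\tau^2V(-1/\tau)e^{\pi iz\tau}\,d\tau.
\end{aligned}\tag{6.50}$$

Expression (6.50) analytically continues $B(z)$ to the half-plane Re(z) > 0.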


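Proof. This is similar to the proof of Lemma 6.7. As in that proof, denote the right-hand side of (6.50) by $\tilde B(z)$, which we represent as

$$\tilde B(z) = -i\bigl(B_{-1}(z) + B_1(z) - 2B_0(z) - 2B_{i\infty}(z)\bigr),$$

where

$$B_{-1}(z) = \int_{\Psi_{-1}}V(\tau+1)e^{\pi iz\tau}\,d\tau, \qquad B_1(z) = \int_{\Psi_1}V(\tau-1)e^{\pi iz\tau}\,d\tau,$$
$$B_0(z) = \int_{\Psi_0}V(\tau)e^{\pi iz\tau}\,d\tau, \qquad B_{i\infty}(z) = \int_{\Psi_{i\infty}}\tau^2V(-1/\tau)e^{\pi iz\tau}\,d\tau.$$

The proof that the integrals converge in the half-plane Re(z) > 0 and are holomorphic there is similar to the analogous claim for the integrals (6.19)–(6.22) and is omitted.

We now check that $\tilde B(z)$ coincides with $B(z)$ where the latter is defined. Assume that Re(z) > 2. Rewrite definition (6.49) of $B(z)$ as

$$\begin{aligned}
B(z) &= -i\bigl(e^{\pi iz/2} - e^{-\pi iz/2}\bigr)^2\int_0^{i\infty}V(\tau)e^{\pi iz\tau}\,d\tau \\
&= -i\bigl(e^{\pi iz} - 2 + e^{-\pi iz}\bigr)\int_0^{i\infty}V(\tau)e^{\pi iz\tau}\,d\tau \\
&= -i\int_0^{i\infty}V(\tau)e^{\pi iz(\tau+1)}\,d\tau + 2i\int_0^{i\infty}V(\tau)e^{\pi iz\tau}\,d\tau - i\int_0^{i\infty}V(\tau)e^{\pi iz(\tau-1)}\,d\tau \\
&= -i\int_1^{1+i\infty}V(\rho-1)e^{\pi iz\rho}\,d\rho - i\int_{-1}^{-1+i\infty}V(\xi+1)e^{\pi iz\xi}\,d\xi \\
&\qquad + 2i\int_{\Psi_0}V(\tau)e^{\pi iz\tau}\,d\tau + 2i\int_{\Psi_{i\infty}}V(\tau)e^{\pi iz\tau}\,d\tau.
\end{aligned}\tag{6.51}$$

Now as in the proof of Lemma 6.7, the reader can check that the straight line contours $[-1, -1+i\infty)$ and $[1, 1+i\infty)$ can be deformed into the concatenated contours $\Psi_{-1} + \Psi_{i\infty}$ and $\Psi_1 + \Psi_{i\infty}$, respectively, without changing the values of the respective contour integrals. Performing this deformation transforms (6.51), after some minor rearrangement and regrouping of terms, into the relation

$$\begin{aligned}
B(z) = -i\left(\int_{\Psi_1}V(\tau-1)e^{\pi iz\tau}\,d\tau + \int_{\Psi_{-1}}V(\tau+1)e^{\pi iz\tau}\,d\tau\right. \\
\left. + \int_{\Psi_{i\infty}}\bigl(V(\tau+1) - 2V(\tau) + V(\tau-1)\bigr)e^{\pi iz\tau}\,d\tau - 2\int_{\Psi_0}V(\tau)e^{\pi iz\tau}\,d\tau\right),
\end{aligned}$$

whereupon, after making use of (6.37) to simplify the third of the four integrals, we finally get that

$$\begin{aligned}
B(z) &= -i\left(\int_{\Psi_1}V(\tau-1)e^{\pi iz\tau}\,d\tau + \int_{\Psi_{-1}}V(\tau+1)e^{\pi iz\tau}\,d\tau - 2\int_{\Psi_{i\infty}}\tau^2V(-1/\tau)e^{\pi iz\tau}\,d\tau - 2\int_{\Psi_0}V(\tau)e^{\pi iz\tau}\,d\tau\right) \\
&= -i\bigl(B_1(z) + B_{-1}(z) - 2B_{i\infty}(z) - 2B_0(z)\bigr) = \tilde B(z),
\end{aligned}$$

as was to be shown.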


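Lemma 6.20. The function $B(z)$ is also given by the alternative expression

$$B(z) = -4\sin^2\left(\frac{\pi z}{2}\right)\left[\frac{144}{\pi z} + \frac{1}{\pi(z-2)} + \int_0^\infty\bigl(V(it) - 144 - e^{2\pi t}\bigr)e^{-\pi zt}\,dt\right]. \tag{6.52}$$

Representation (6.52) analytically continues $B(z)$ to the region Re(z) > −1 (with the obvious limiting interpretation at the points z = 0 and z = 2, which are removable singularities).


Proof. Let Re(z) > 2. Starting from (6.49), we write

$$\begin{aligned}
B(z) &= -4\sin^2\left(\frac{\pi z}{2}\right)\int_0^\infty V(it)e^{-\pi zt}\,dt \\
&= -4\sin^2\left(\frac{\pi z}{2}\right)\left[\int_0^\infty\bigl(V(it) - 144 - e^{2\pi t}\bigr)e^{-\pi zt}\,dt + \int_0^\infty\bigl(144 + e^{2\pi t}\bigr)e^{-\pi zt}\,dt\right] \\
&= -4\sin^2\left(\frac{\pi z}{2}\right)\left[\frac{144}{\pi z} + \frac{1}{\pi(z-2)} + \int_0^\infty\bigl(V(it) - 144 - e^{2\pi t}\bigr)e^{-\pi zt}\,dt\right].
\end{aligned}$$

Now inspect the last integral to conclude from (6.42)–(6.43) (appealing as before to the result of Exercise 1.26) that this integral converges and defines a holomorphic function on Re(z) > −1.


Lemma 6.21. $B(z)$ satisfies

$$B(0) = 0. \tag{6.53}$$


Proof. Immediate from (6.52).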


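Lemma 6.22. For any $k\ge0$, there exist constants $C_1, C_2 > 0$ such that the $k$th derivative $B^{(k)}(z)$ of $B(z)$ satisfies the bound

$$|B^{(k)}(z)| \le C_1e^{-C_2\sqrt{\operatorname{Re}(z)}} \quad (\operatorname{Re}(z) > 3).$$


Proof. Similar to the proof of Lemma 6.11.


Now let $\varphi_- : \mathbb{R}^8\to\mathbb{C}$ be the radial function defined by
$$\varphi_-(x) = B(\|x\|^2).$$

For $x\ne0$, we can write explicitly

$$\begin{aligned}
\varphi_-(x) = {}& -i\int_{\Psi_{-1}}V(\tau+1)e^{\pi i\tau\|x\|^2}\,d\tau - i\int_{\Psi_1}V(\tau-1)e^{\pi i\tau\|x\|^2}\,d\tau \\
& + 2i\int_{\Psi_0}V(\tau)e^{\pi i\tau\|x\|^2}\,d\tau + 2i\int_{\Psi_{i\infty}}\tau^2V(-1/\tau)e^{\pi i\tau\|x\|^2}\,d\tau.
\end{aligned}\tag{6.54}$$


Lemma 6.23. $\varphi_-(x)$ is a Schwartz function.


Proof. Analogous to the proof of Lemma 6.12.

Lemma 6.24. $\varphi_-(x)$ is a (−1)-eigenfunction for the Fourier transform in $\mathbb{R}^8$.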


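Proof. This is a calculation similar to the one in the proof of Lemma 6.13. Namely, using representation (6.54) and commuting the integrals and Fourier transforms, we have for $y\in\mathbb{R}^8\setminus\{0\}$ that

$$\begin{aligned}
\mathcal{F}_8[\varphi_-](y) = {}& -i\int_{\Psi_{-1}}V(\tau+1)\,\mathcal{F}_8\bigl[e^{\pi i\tau\|x\|^2}\bigr](y)\,d\tau - i\int_{\Psi_1}V(\tau-1)\,\mathcal{F}_8\bigl[e^{\pi i\tau\|x\|^2}\bigr](y)\,d\tau \\
& + 2i\int_{\Psi_0}V(\tau)\,\mathcal{F}_8\bigl[e^{\pi i\tau\|x\|^2}\bigr](y)\,d\tau + 2i\int_{\Psi_{i\infty}}\tau^2V(-1/\tau)\,\mathcal{F}_8\bigl[e^{\pi i\tau\|x\|^2}\bigr](y)\,d\tau \\
= {}& -i\int_{\Psi_{-1}}V(\tau+1)\,\tau^{-4}e^{\pi i(-1/\tau)\|y\|^2}\,d\tau - i\int_{\Psi_1}V(\tau-1)\,\tau^{-4}e^{\pi i(-1/\tau)\|y\|^2}\,d\tau \\
& + 2i\int_{\Psi_0}V(\tau)\,\tau^{-4}e^{\pi i(-1/\tau)\|y\|^2}\,d\tau + 2i\int_{\Psi_{i\infty}}\tau^2V(-1/\tau)\,\tau^{-4}e^{\pi i(-1/\tau)\|y\|^2}\,d\tau.
\end{aligned}\tag{6.55}$$

Now making the change of variables $\rho = -1/\tau$ as in the proof of Lemma 6.13 and recalling that the pitchfork paths get permuted as in (6.33), the expression in (6.55) becomes

$$\begin{aligned}
& -i\int_{\Psi_1}V\left(-\frac1\rho+1\right)\rho^4e^{\pi i\rho\|y\|^2}\,\frac{d\rho}{\rho^2} - i\int_{\Psi_{-1}}V\left(-\frac1\rho-1\right)\rho^4e^{\pi i\rho\|y\|^2}\,\frac{d\rho}{\rho^2} \\
& - 2i\int_{\Psi_{i\infty}}V\left(-\frac1\rho\right)\rho^4e^{\pi i\rho\|y\|^2}\,\frac{d\rho}{\rho^2} - 2i\int_{\Psi_0}(-1/\rho)^2\,V(\rho)\,\rho^4e^{\pi i\rho\|y\|^2}\,\frac{d\rho}{\rho^2}.
\end{aligned}\tag{6.56}$$

Finally, making use of (6.38)–(6.39) (the analogues of the relations (6.9)–(6.10) that were used in the proof of Lemma 6.13) gives an expression, which we easily recognize as being equal to $-\varphi_-(y)$.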


Lemma 6.25. The radial profile $\varphi_-(r)$ associated with $\varphi_-(x)$ has zeros at $r = \sqrt{2n}$, $n = 0, 1, 2, \ldots$. The zero at $r = \sqrt2$ is simple, and the zeros at $r = \sqrt{2n}$, $n = 0, 2, 3, \ldots$, are of order 2.


Proof. It follows from (6.52) that $B(z)$ has simple zeros at $z = 0$ and $z = 2$ and double zeros at $z = 4, 6, 8, \ldots$. Since $\varphi_-(r) = B(r^2)$, the result follows.

The results of this section and the previous one, taken together, prove Theorem 6.2. This gets us most of the way toward an eventual proof of Theorem 6.1. Note that so far our analysis has treated the functions $\varphi_+(x)$ and $\varphi_-(x)$ completely separately from each other. To complete the proof of Theorem 6.1, we will need to gain some additional insight into how the two functions relate to each other or, going back to the two functions $U(\tau)$, $V(\tau)$ in terms of which $\varphi_+(x)$ and $\varphi_-(x)$ were defined, how those two functions in turn compare with each other. This is discussed in the next section.


6.6 A modular form inequality


Our goal in this section is to prove the following result. 


Theorem 6.26 (Viazovska’s modular form inequality). The functions $U(\tau)$ and $V(\tau)$ satisfy the inequality

$$U(it) < V(it) \quad (t > 0). \tag{6.57}$$


Inequality (6.57) plays a key role in the proof of Theorem 6.1; as we will see in the 
next section, it is needed to establish the fact that our constructed magic function can- 
didate satisfies the nonnegativity condition in Theorem A.21. 

Viazovska’s original proof of Theorem 6.26 in [71] relied on computer calculations. 
The proof presented below, adapted from [58], offers a more direct approach. 


6.6.1 Preparation 


As preparation for the proof, recall the functions $\tilde U(\tau)$ and $\tilde V(\tau)$, which made minor appearances in the proofs of Lemmas 6.4 and 6.16. They are given by

$$\tilde U(\tau) = \tau^2U(-1/\tau) = 108\,\frac{(E_4')^2}{E_4^3 - E_6^2},$$
$$\tilde V(\tau) = \tau^2V(-1/\tau) = -128\left(\frac{\theta_2^4 + \theta_3^4}{\theta_4^8} + \frac{\theta_2^4 - \theta_4^4}{\theta_3^8}\right).$$

Because of the reciprocal relation between $it$ and $i/t = -1/(it)$, inequality (6.57) is equivalent to the claim that both the inequalities $U(it) < V(it)$ and $-\tilde U(it) < -\tilde V(it)$ hold for $t\ge1$. As a further simplification, we can clear the denominators in the expressions for $U(\tau)$, $V(\tau)$, $\tilde U(\tau)$, $\tilde V(\tau)$ by multiplying all four functions by $E_4^3 - E_6^2$ (which can also be written as $\frac{27}{4}(\theta_2\theta_3\theta_4)^8$ by (5.63) and (6.5); this function takes positive values on the positive imaginary axis). This leads us to defining the functions $F$, $\tilde F$, $G$, $\tilde G$ by

$$F(\tau) = \tfrac{1}{108}(E_4^3 - E_6^2)\,U(\tau) = (E_4')^2\tau^2 + 8E_4E_4'\tau + 16E_4^2, \tag{6.58}$$
$$\tilde F(\tau) = \tfrac{1}{108}(E_4^3 - E_6^2)\,\tilde U(\tau) = (E_4')^2, \tag{6.59}$$
$$G(\tau) = \tfrac{1}{108}(E_4^3 - E_6^2)\,V(\tau) = 8\theta_4^8\bigl(\theta_3^{12} + \theta_4^4\theta_3^8 + \theta_2^8\theta_4^4 - \theta_2^{12}\bigr), \tag{6.60}$$
$$\tilde G(\tau) = \tfrac{1}{108}(E_4^3 - E_6^2)\,\tilde V(\tau) = -8\theta_2^8\bigl(\theta_3^{12} + \theta_2^4\theta_3^8 + \theta_4^8\theta_2^4 - \theta_4^{12}\bigr). \tag{6.61}$$

The normalization by a common numerical factor of 1/108 is added to simplify some of the formulas. Our goal is now to prove the pair of inequalities

$$-\tilde F(it) < -\tilde G(it) \quad \text{if } t\ge1, \tag{6.62}$$
$$F(it) < G(it) \quad \text{if } t\ge1. \tag{6.63}$$

By the above remarks this will be sufficient to imply (6.57).


6.6.2 Some special values of modular forms 


Our proof of inequalities (6.62)-(6.63) will rely on the numerical values of certain con- 
stants obtained from evaluating various modular forms and related functions at T = i. 
The relevant evaluations are given below. 


Lemma 6.27 (Special values of modular forms at $\tau = i$). We have the following identities:

$$E_4(i) = \frac{3\,\Gamma(1/4)^8}{(2\pi)^6} \approx 1.45576, \tag{6.64}$$
$$E_4'(i) = \frac{3i\,\Gamma(1/4)^8}{32\pi^6} \approx 2.91152\,i, \tag{6.65}$$
$$\theta_2(i) = \frac{\Gamma(1/4)}{(2\pi)^{3/4}} \approx 0.91357, \tag{6.66}$$
$$\theta_3(i) = \frac{\Gamma(1/4)}{\sqrt2\,\pi^{3/4}} \approx 1.08643, \tag{6.67}$$
$$\theta_4(i) = \frac{\Gamma(1/4)}{(2\pi)^{3/4}} \approx 0.91357. \tag{6.68}$$

In these formulas, $\Gamma$ is the Euler gamma function.


Sketch of proof. For the proof of (6.66)–(6.67), refer to [8, p. 325] (which appeals to results from Chapter 17 of [7]) or see alternatively [19], where these identities appear as equation (2.21) on p. 307. Evaluation (6.66) also implies (6.68) through the observation that $\theta_2(i) = \theta_4(i)$, a consequence of (5.57).

Formula (6.64) can now be shown using (6.66)–(6.68) and identity (5.60) from Chapter 5 expressing $E_4$ in terms of the thetanull functions.

Finally, (6.65) is obtained by combining (6.64) with the results of Exercises 5.14 and 5.18, recalling the fact (shown in Lemma 5.15) that $E_6(i) = 0$.
The evaluations in the lemma are closely related to Gauss’s lemniscate constant

$$\varpi = 2\int_0^1\frac{dx}{\sqrt{1-x^4}} = \frac{\Gamma(1/4)^2}{2\sqrt{2\pi}},$$

an important mathematical constant. See [19], [27, Sec. 6.1], [W22], [W23] for more details.

The proof of (6.62)–(6.63) given below is robust in the sense that it does not depend on the exact values given in the lemma; the inequalities we are dealing with have “slackness,” so we really only need approximate numerical values of the five constants (6.64)–(6.68). These constants are all expressible as rapidly converging infinite series, so, as an alternative to relying on the closed-form evaluations (6.64)–(6.68), we can simply calculate the numerical values to a few digits of accuracy using a computer.
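Such a computer calculation might look as follows. This is a minimal sketch rather than anything prescribed by the text: it sums the standard series for $E_4$, $E_4'$ and the thetanull functions at $\tau = i$ (so $Q = e^{-\pi}$, $q = e^{-2\pi}$) and compares them with gamma-function closed forms. The truncation orders and variable names are arbitrary choices, and the closed forms are written in one of several equivalent ways.

```python
import math

G = math.gamma(0.25)      # Gamma(1/4)
Q = math.exp(-math.pi)    # Q = e^{pi i tau} at tau = i
q = Q * Q                 # q = e^{2 pi i tau} at tau = i

def sigma3(n):
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

# Truncated series; terms decay like e^{-pi n^2} (thetas) or e^{-2 pi n} (E4).
theta2 = 2 * sum(Q ** ((n + 0.5) ** 2) for n in range(0, 12))
theta3 = 1 + 2 * sum(Q ** (n * n) for n in range(1, 12))
theta4 = 1 + 2 * sum((-1) ** n * Q ** (n * n) for n in range(1, 12))
E4 = 1 + 240 * sum(sigma3(n) * q**n for n in range(1, 40))
# E4'(i) is purely imaginary; this is E4'(i) / i.
dE4_over_i = 2 * math.pi * 240 * sum(n * sigma3(n) * q**n for n in range(1, 40))

closed_E4 = 3 * G**8 / (2 * math.pi) ** 6
closed_dE4_over_i = 3 * G**8 / (32 * math.pi**6)
closed_theta2 = G / (2 * math.pi) ** 0.75
closed_theta3 = G / (math.sqrt(2) * math.pi**0.75)

print(E4, closed_E4)
print(dE4_over_i, closed_dE4_over_i)
print(theta2, theta4, closed_theta2)
print(theta3, closed_theta3)
```

A dozen terms of each theta series already exceed double precision, which illustrates the "rapidly converging" remark above.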


6.6.3 Proof of (6.62) 


We proceed with a proof of inequality (6.62). To develop first a rough sense of why we expect such an inequality to hold, at least for large values of $t$, it helps to look at the expansions of the functions involved in powers of the variable $Q$. Those are given, as we can easily check using a computer algebra system, by

$$-\tilde F(\tau) = 230400\pi^2Q^4 + 8294400\pi^2Q^6 + 113356800\pi^2Q^8 + 831283200\pi^2Q^{10} + 4337971200\pi^2Q^{12} + \cdots, \tag{6.69}$$
$$-\tilde G(\tau) = 163840\,Q^3 + 16121856\,Q^5 + 333250560\,Q^7 + 3199467520\,Q^9 + 19472547840\,Q^{11} + \cdots. \tag{6.70}$$

When $\tau = it$, we have $Q = e^{-\pi t}$, so a key point to note is that for large $t$, the dominant term in the expansion of $-\tilde F(it)$ decays like $e^{-4\pi t}$, whereas the dominant term in the expansion of $-\tilde G(it)$ decays like $e^{-3\pi t}$, so we will certainly have that $-\tilde F(it) < -\tilde G(it)$ if $t$ is large enough.

In fact, with a bit of additional reasoning, we can show that the inequality holds for all $t\ge1$. First, observe that the coefficients in expansion (6.69) are all nonnegative; this is immediate from (5.87) and (6.59). Second, we claim similarly that the coefficients in (6.70) are all nonnegative. To see this, note that, by the transformation properties (5.57)–(5.59) of the thetanull functions, we can represent $\tilde G(\tau)$ as

$$\tilde G(\tau) = \psi(\tau+1) - \psi(\tau),$$
where $\psi(\tau)$ is defined by

$$\psi(\tau) = 8\theta_2^8\theta_3^{12} + 8\theta_2^{12}\theta_3^8.$$

Now the substitution $\tau\mapsto\tau+1$ corresponds to replacing $Q$ by $-Q$. Therefore the $Q$-series expansion of $-\tilde G(\tau)$ has all even coefficients equal to 0 and all odd coefficients equal to twice the respective coefficients of $\psi(\tau)$, which are manifestly nonnegative. This proves the nonnegativity claim.

From the above remarks it now follows that the function $t\mapsto -Q^{-3}\tilde F(it) = -e^{3\pi t}\tilde F(it)$ is a decreasing function of $t$ (since each term in its $Q$-series expansion is a nonnegative coefficient times a decreasing exponential of the form $e^{-n\pi t}$). This implies that for $t\ge1$, we have the bound

$$-e^{3\pi t}\tilde F(it) \le -e^{3\pi}\tilde F(i) = -e^{3\pi}\bigl(E_4'(i)\bigr)^2$$
or, using (6.65),

$$-e^{3\pi t}\tilde F(it) \le e^{3\pi}\,\frac{9\,\Gamma(1/4)^{16}}{1024\,\pi^{12}} \approx 105043.78 \quad (t\ge1). \tag{6.71}$$

On the other hand, by (6.70) and the observation about the nonnegativity of the coefficients of $-\tilde G(\tau)$ we have the bound

$$-e^{3\pi t}\tilde G(it) \ge 163840 \tag{6.72}$$

for all $t > 0$. Combining (6.71) and (6.72) gives (6.62).
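The numerical comparison that closes this argument is a one-line computation. The snippet below is an illustrative check only (the variable names are ad hoc): it evaluates the closed-form upper bound on $-e^{3\pi t}\tilde F(it)$ and compares it with the lower bound 163840 on $-e^{3\pi t}\tilde G(it)$.

```python
import math

G = math.gamma(0.25)  # Gamma(1/4)

# Upper bound for -e^{3 pi t} F~(it), t >= 1, cf. (6.71)
upper_F = math.exp(3 * math.pi) * 9 * G**16 / (1024 * math.pi**12)
# Lower bound for -e^{3 pi t} G~(it), t > 0, cf. (6.72)
lower_G = 163840

print(upper_F, lower_G, upper_F < lower_G)
```

The gap between the two bounds is what the text earlier calls "slackness": the values at $\tau = i$ only need to be known to a couple of significant digits.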


6.6.4 Proof of (6.63) 


As with the proof of (6.62), before tackling inequality (6.63) for the full range $t\ge1$, it is helpful to put on our asymptotician hat and first ask the question of why we should expect the inequality to hold for large $t$. The answer is that the asymptotic expansions of the functions $F(it)$ and $G(it)$ are given by

$$\begin{aligned}
F(it) = {}& 16 + (-3840\pi t + 7680)Q^2 + (230400\pi^2t^2 - 990720\pi t + 990720)Q^4 \\
& + (8294400\pi^2t^2 - 25205760\pi t + 16803840)Q^6 + \cdots,
\end{aligned}\tag{6.73}$$
$$\begin{aligned}
G(it) = {}& 16 + 1920\,Q^2 - 81920\,Q^3 + 1077120\,Q^4 - 8060928\,Q^5 + 41725440\,Q^6 \\
& - 166625280\,Q^7 + 553054080\,Q^8 - 1599733760\,Q^9 + \cdots,
\end{aligned}\tag{6.74}$$

where $Q = e^{-\pi t}$ as before. Here (6.74) is an ordinary $Q$-series expansion, whereas (6.73) is a somewhat nonstandard type of expansion that involves powers of $Q = e^{-\pi t}$, with each coefficient being itself a quadratic polynomial in $t$ (refer to (6.58) to understand where this structure comes from).

Now the insight we get from these two expansions is that they share the same constant term 16 and that both have a next-order term proportional to $Q^2$ with coefficients $7680 - 3840\pi t$ and $1920$, respectively. Since $7680 - 3840\pi t < 0 < 1920$ for $t\ge1$, again we see that for $t$ large, once the lower-order terms have decayed sufficiently, the relation $F(it) < G(it)$ will necessarily hold.

To turn this line of argumentation into a proof of the stronger claim that the inequality $F(it) < G(it)$ holds for all $t\ge1$, we need to gain some measure of control over those lower-order terms, since for moderately sized $t$, they are not altogether negligible. This requires more subtle reasoning than that used in the proof of (6.62), since in the current case, both expansions (6.73) and (6.74) involve a mixture of terms with positive and negative coefficients.


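Lemma 6.28. Define
$$W(\tau) = \theta_3^8\theta_4^{12} + \theta_3^{12}\theta_4^8 + \theta_2^8\theta_3^{12} + \theta_2^{12}\theta_3^8. \tag{6.75}$$
The coefficients in the $Q$-series expansion of $W$ are nonnegative.

Proof. Denote for convenience
$$Z = \theta_3^4, \qquad X = \theta_2^4, \qquad Y = 2Z - X.$$

Note that $\theta_4^4 = Z - X$, by (5.82). Now $Z$ and $X$ have $Q$-series expansions with nonnegative coefficients. Moreover, recalling (5.82), we see that $Y$ can be written as $Y = Z + \theta_4^4 = \theta_3(\tau)^4 + \theta_3(\tau+1)^4$, which implies that $Y$ also has a $Q$-series expansion with nonnegative coefficients. Therefore by straightforward algebra we get that

$$\begin{aligned}
W &= Z^2(Z-X)^3 + Z^3(Z-X)^2 + Z^3X^2 + Z^2X^3 \\
&= \left(\frac{X+Y}{2}\right)^2\left(\frac{Y-X}{2}\right)^3 + \left(\frac{X+Y}{2}\right)^3\left(\frac{Y-X}{2}\right)^2 + \left(\frac{X+Y}{2}\right)^3X^2 + \left(\frac{X+Y}{2}\right)^2X^3 \\
&= \frac{1}{16}\bigl(6X^5 + 15X^4Y + 10X^3Y^2 + Y^5\bigr).
\end{aligned}$$

This representation clearly shows that the $Q$-series expansion of $W$ also has nonnegative coefficients.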
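The "straightforward algebra" in this proof is a polynomial identity in two variables, so it can be tested exactly with rational arithmetic. The sketch below is illustrative only: the function names are ad hoc, and $W$ is taken in the form $Z^2T^3 + Z^3T^2 + X^2Z^3 + X^3Z^2$ with $T = \theta_4^4 = Z - X$. It compares the two sides at random rational points; agreement of two degree-5 polynomials at this many random points is strong evidence for the identity, though not itself a proof.

```python
from fractions import Fraction
import random

def W_theta(X, Z):
    """W with X = theta_2^4, Z = theta_3^4 and theta_4^4 = Z - X."""
    T = Z - X
    return Z**2 * T**3 + Z**3 * T**2 + X**2 * Z**3 + X**3 * Z**2

def W_xy(X, Y):
    """Candidate closed form in X and Y = 2Z - X."""
    return Fraction(1, 16) * (6 * X**5 + 15 * X**4 * Y + 10 * X**3 * Y**2 + Y**5)

random.seed(0)
identity_holds = True
for _ in range(200):
    X = Fraction(random.randint(-50, 50), random.randint(1, 9))
    Z = Fraction(random.randint(-50, 50), random.randint(1, 9))
    identity_holds = identity_holds and (W_theta(X, Z) == W_xy(X, 2 * Z - X))
print(identity_holds)
```

For instance, $X = 1$, $Z = 2$ (so $Y = 3$) gives $24$ on both sides.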


Next, it is helpful to renormalize the functions $F$ and $G$ by defining the new functions

$$K(\tau) = -Q^{-2}\bigl(F(\tau) - 16\bigr) = -16Q^{-2}\bigl(E_4^2 - 1\bigr) - 8Q^{-2}E_4E_4'\tau - Q^{-2}(E_4')^2\tau^2,$$
$$L(\tau) = -Q^{-2}\bigl(G(\tau) - 16\bigr) = -8Q^{-2}\Bigl[\theta_4^8\bigl(\theta_3^{12} + \theta_4^4\theta_3^8 + \theta_2^8\theta_4^4 - \theta_2^{12}\bigr) - 2\Bigr].$$

Inequality (6.63) can now be restated as the claim that $K(it) > L(it)$ for $t\ge1$. This will follow from the combination of the following two lemmas.

Lemma 6.29. $L(it) < 2297$ for all $t\ge1$.
Lemma 6.30. $K(it) > 3747$ for all $t\ge1$.
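Before turning to the proofs, the two bounds can be probed numerically. The following sketch is only a plausibility check at a few sample points and does not replace the proofs. It assumes $K = -Q^{-2}(F - 16)$ and $L = -Q^{-2}(G - 16)$ with $F$, $G$ as in (6.58) and (6.60), and evaluates everything from truncated series at $\tau = it$; the function name `K_and_L` and the truncation orders are arbitrary.

```python
import math

def sigma3(n):
    return sum(d**3 for d in range(1, n + 1) if n % d == 0)

S3 = [sigma3(n) for n in range(1, 41)]

def K_and_L(t):
    """Return (K(it), L(it)) from truncated series, with Q = e^{-pi t}."""
    Q = math.exp(-math.pi * t)
    q = Q * Q
    E4 = 1 + 240 * sum(s * q**n for n, s in enumerate(S3, start=1))
    # tau * E4'(tau) at tau = it equals -480 pi t * sum n sigma_3(n) q^n (real)
    tau_dE4 = -480 * math.pi * t * sum(n * s * q**n for n, s in enumerate(S3, start=1))
    F = (tau_dE4 + 4 * E4) ** 2
    th2 = 2 * sum(Q ** ((n + 0.5) ** 2) for n in range(0, 12))
    th3 = 1 + 2 * sum(Q ** (n * n) for n in range(1, 12))
    th4 = 1 + 2 * sum((-1) ** n * Q ** (n * n) for n in range(1, 12))
    G = 8 * th4**8 * (th3**12 + th4**4 * th3**8 + th2**8 * th4**4 - th2**12)
    return -(F - 16) / q, -(G - 16) / q

for t in (1.0, 1.5, 2.0, 3.0):
    K, L = K_and_L(t)
    print(t, K, L)  # expect K above 3747 and L below 2297 at each sample
```

For much larger $t$ the subtraction $F - 16$ loses precision in floating point, so the sample points are kept moderate.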
Proof of Lemma 6.29. The expansion of L(it) in powers of Q is easily written as 


L(it) = -1920 + 819200 — 107712007 + 8 060 9280° — 4172544004 
+ 166 625 280Q° — 553 054 0800° + 1599 733 760Q’ +--- (6.76) 


(compare with (6.74)). Again using the substitution T + T + 1, we also have 


-L(it + 1) = 1920 + 819200 + 107712097 + 8 060 928Q° + 41725 4400" 
+ 166 625 2800° + 553 054 0800° + 1599 733 760Q’ +---. (6.77) 


On the other hand, using the usual properties of this substitution, we have 


-L(t +1) = Q*(G(t +1) — 16) = 80° [0$ (077 + 030% + 0805 + 0) — 2] 
= 80*(W(r) - 2) 
(with W defined in (6.75)). Lemma 6.28 reassures us that the coefficients in expan- 


sion (6.77) are nonnegative, and consequently the coefficients in (6.76) appear with 
alternating signs. Now defining 


H(n) = L(t) = + D 


we see that H(it) has the expansion 
H(it) = 81920Q + 8 060 928Q° + 166 625 280Q° + 1599 733 7600" +--- 


with coefficients that are also nonnegative and majorize the nonconstant coefficients of $L(it)$.
Note moreover that the constant coefficient in $L(it)$ is $-1920$, whereas the constant coefficient
in $H(it)$ is 0. Therefore the bound $L(it) \le H(it) - 1920$ holds for all $t > 0$. In fact, since $H(it)$ is
decreasing in $t$, we get a constant upper bound for $L(it)$ on the interval $[1, \infty)$, namely

\[ L(it) \le H(i) - 1920 \qquad (t \ge 1). \]
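The coefficients quoted in (6.76) and (6.77) can be checked independently with exact integer arithmetic. The following is a minimal sketch, assuming the reading $W = \theta_3^8(\theta_4^{12} + \theta_3^4\theta_4^8 + \theta_2^8\theta_3^4 + \theta_2^{12})$ of (6.75), the convention $Q = e^{\pi i \tau}$, and the standard series $\theta_3 = \sum_n Q^{n^2}$, $\theta_4 = \sum_n (-1)^n Q^{n^2}$, $\theta_2 = 2Q^{1/4}\sum_{n \ge 0} Q^{n(n+1)}$. All function names are hypothetical helpers, not notation from the text.

```python
# Truncated Q-series with exact integers; index k holds the coefficient of Q^k.
N = 10  # number of coefficients kept

def mul(a, b):
    out = [0] * N
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j >= N:
                break
            out[i + j] += ai * bj
    return out

def power(a, k):
    out = [1] + [0] * (N - 1)
    for _ in range(k):
        out = mul(out, a)
    return out

def shift(a, s):
    # Multiply a series by Q^s, truncating back to N coefficients.
    return ([0] * s + a)[:N]

def add(*series):
    return [sum(c) for c in zip(*series)]

def theta3_like(sign):
    # sign = +1 gives theta_3, sign = -1 gives theta_4.
    c = [0] * N
    c[0] = 1
    n = 1
    while n * n < N:
        c[n * n] += 2 * sign ** n
        n += 1
    return c

def psi():
    # psi = sum_{n>=0} Q^{n(n+1)}, so that theta_2^4 = 16 Q psi^4.
    c = [0] * N
    n = 0
    while n * (n + 1) < N:
        c[n * (n + 1)] += 1
        n += 1
    return c

t3, t4 = theta3_like(1), theta3_like(-1)
th2_8 = shift([256 * c for c in power(psi(), 8)], 2)     # theta_2^8
th2_12 = shift([4096 * c for c in power(psi(), 12)], 3)  # theta_2^12

bracket = add(power(t4, 12), mul(power(t3, 4), power(t4, 8)),
              mul(th2_8, power(t3, 4)), th2_12)
W = mul(power(t3, 8), bracket)

# Coefficients of 8 Q^{-2} (W - 2), i.e. of -L(it + 1) under the assumptions above.
minus_L_shifted = [8 * c for c in W[2:]]
# Odd-indexed coefficients give the expansion of H(it).
H_coeffs = minus_L_shifted[1::2]
print(minus_L_shifted)
print(H_coeffs)
```

Because only integer operations are used, the printed lists can be compared digit by digit with the displayed expansions.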


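To make this bound explicit, we express $H(\tau)$ directly in terms of thetanull functions.
Appealing to (5.57)-(5.59) as before, we have

\[
\begin{aligned}
H(\tau) &= -4Q^{-2}\big[\theta_4^8(\theta_3^{12} + \theta_4^4\theta_3^8 + \theta_2^8\theta_4^4 - \theta_2^{12}) - 2 \\
        &\qquad\qquad - \theta_3^8(\theta_4^{12} + \theta_3^4\theta_4^8 + \theta_2^8\theta_3^4 + \theta_2^{12}) + 2\big] \\
        &= 4Q^{-2}\big(\theta_2^8\theta_3^{12} + \theta_2^{12}\theta_3^8 + \theta_2^{12}\theta_4^8 - \theta_2^8\theta_4^{12}\big) \\
        &= 4Q^{-2}\big(\theta_2^8(\theta_3^{12} - \theta_4^{12}) + \theta_2^{12}(\theta_3^8 + \theta_4^8)\big).
\end{aligned}
\]

Therefore, making use of (6.66)-(6.68), $H(i)$ can be calculated as

\[
H(i) = 4e^{2\pi}\,\theta_3(i)^{20}\Big(\frac{1}{4} + \frac{1}{8}\Big)
     = \frac{3}{2}\,e^{2\pi}\,\theta_3(i)^{20}
     = \frac{3e^{2\pi}\,\Gamma(1/4)^{20}}{2048\,\pi^{15}} \approx 4216.16
\]

(the two terms involving $\theta_4$ cancel at $\tau = i$, since $\theta_2(i) = \theta_4(i)$).

We conclude that $L(it) < 4217 - 1920 = 2297$ for all $t \ge 1$, as claimed.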
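As a sanity check on the value $H(i) \approx 4216.16$, one can evaluate the theta series numerically and compare with the closed form. This is only a sketch: it assumes the expression $H(\tau) = 4Q^{-2}\big(\theta_2^8(\theta_3^{12} - \theta_4^{12}) + \theta_2^{12}(\theta_3^8 + \theta_4^8)\big)$ with $Q = e^{\pi i\tau}$, and the helper names are hypothetical.

```python
import math

def theta_nulls(t, terms=12):
    # Thetanull values at tau = i*t, computed from their Q-series (Q = e^{-pi t}).
    Q = math.exp(-math.pi * t)
    th2 = 2 * sum(Q ** ((n + 0.5) ** 2) for n in range(terms))
    th3 = 1 + 2 * sum(Q ** (n * n) for n in range(1, terms))
    th4 = 1 + 2 * sum((-1) ** n * Q ** (n * n) for n in range(1, terms))
    return Q, th2, th3, th4

def H(t):
    # H(it) under the assumed theta-function expression.
    Q, th2, th3, th4 = theta_nulls(t)
    return 4 / Q ** 2 * (th2 ** 8 * (th3 ** 12 - th4 ** 12)
                         + th2 ** 12 * (th3 ** 8 + th4 ** 8))

closed_form = 3 * math.exp(2 * math.pi) * math.gamma(0.25) ** 20 / (2048 * math.pi ** 15)

print(H(1.0), closed_form)      # the two values should agree to many digits
print(H(1.0) - 1920)            # upper bound for L(it) on [1, infinity)
```

Evaluating `H` at a few points $t > 1$ also illustrates the monotonicity used in the proof.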


Proof of Lemma 6.30. The asymptotic expansion for $K(it)$ is

\[
\begin{aligned}
K(it) = {} & (3840\pi t - 7680) + (-230400\pi^2t^2 + 990720\pi t - 990720)Q^2 \\
           & + (-8294400\pi^2t^2 + 25205760\pi t - 16803840)Q^4 + \cdots.
\end{aligned}
\]

We separate $K(it)$ into three components, defining

\[
\begin{aligned}
K_1(t) &= 3840\pi t + (-230400\pi^2t^2 + 990720\pi t - 990720)Q^2, \\
K_2(t) &= Q^{-2}E_4'(it)^2t^2 - 16Q^{-2}(E_4(it)^2 - 1) + (230400\pi^2t^2 + 990720)Q^2, \\
K_3(t) &= -8iQ^{-2}E_4(it)E_4'(it)t - (3840\pi t + 990720\pi tQ^2),
\end{aligned}
\]

so that we have
\[ K(it) = K_1(t) + K_2(t) + K_3(t). \]
The asymptotic behavior of $K_2(t)$ and $K_3(t)$ can be understood from the expansions

\[
\begin{aligned}
K_2(t) &= -7680 - (8294400\pi^2t^2 + 16803840)Q^4 - (113356800\pi^2t^2 + 126819840)Q^6 - \cdots, \\
K_3(t) &= 25205760\pi tQ^4 + 253639680\pi tQ^6 + 1500019200\pi tQ^8 + \cdots.
\end{aligned}
\]


We now make the following elementary observations:

1. The function $K_1(t)$ is increasing on $[1, \infty)$.

Proof. Assume that $t \ge 1$. Then

\[
\begin{aligned}
K_1'(t) &= 3840\pi e^{-2\pi t}\big(e^{2\pi t} + 120\pi^2t^2 - 636\pi t + 774\big) \\
        &\ge 3840\pi e^{-2\pi t}\big(e^{2\pi} + 120\pi^2t^2 - 636\pi t + 774\big).
\end{aligned}
\]

The last expression is of the form $e^{-2\pi t}$ times a quadratic polynomial in $t$, which, as
it is easy to check, is positive on the real line (its discriminant
$(636\pi)^2 - 480\pi^2(e^{2\pi} + 774)$ is negative). Thus we have shown that $K_1'(t) > 0$ for
$t \ge 1$, which proves the claim.

2. The function $K_2(t)$ is increasing on $[1, \infty)$.

Proof. By inspection the expansion of $K_2(t)$ consists of the constant term $-7680$ plus
a sum of lower-order terms, each being of the form $-(at^2 + b)e^{-\pi n t}$ for some nonnegative
coefficients $a, b$ and positive integer $n$. Each such term is an increasing function
of $t$ for $t \ge 2/(\pi n)$, so in particular for $t \ge 1$.

3. $K_3(t) \ge 0$ for all $t > 0$.

Proof. The expansion of $K_3(t)$ has nonnegative coefficients.
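The claims in the first observation are easy to confirm numerically. The sketch below (with hypothetical helper names) checks that the quadratic has negative discriminant and compares the stated formula for $K_1'(t)$ with a central finite difference of $K_1(t) = 3840\pi t + (-230400\pi^2t^2 + 990720\pi t - 990720)e^{-2\pi t}$.

```python
import math

PI = math.pi

# Quadratic 120 pi^2 t^2 - 636 pi t + (e^{2 pi} + 774) from the lower bound on K_1'(t).
a = 120 * PI ** 2
b = -636 * PI
c = math.exp(2 * PI) + 774
discriminant = b * b - 4 * a * c   # negative means no real roots, so the quadratic is positive

def K1(t):
    return 3840 * PI * t + (-230400 * PI ** 2 * t ** 2 + 990720 * PI * t - 990720) * math.exp(-2 * PI * t)

def K1_prime(t):
    # Closed form of the derivative as stated in the proof.
    return 3840 * PI * math.exp(-2 * PI * t) * (
        math.exp(2 * PI * t) + 120 * PI ** 2 * t ** 2 - 636 * PI * t + 774)

def central_difference(f, t, h=1e-5):
    return (f(t + h) - f(t - h)) / (2 * h)

print(discriminant)
print(K1_prime(1.3), central_difference(K1, 1.3))
```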


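Combining the above observations, we get that for $t \ge 1$,

\[
\begin{aligned}
K(it) &\ge K_1(t) + K_2(t) \ge K_1(1) + K_2(1) \\
      &= -e^{2\pi}\big(-E_4'(i)^2 + 16E_4(i)^2 - 16\big) + 3840\pi + 990720\pi e^{-2\pi} \\
      &= -e^{2\pi}\Big(\frac{9\,\Gamma(1/4)^{16}}{1024\,\pi^{12}} + 16\cdot\frac{9\,\Gamma(1/4)^{16}}{4096\,\pi^{12}} - 16\Big) + 3840\pi + 990720\pi e^{-2\pi} \\
      &= -e^{2\pi}\Big(\frac{45\,\Gamma(1/4)^{16}}{1024\,\pi^{12}} - 16\Big) + 3840\pi + 990720\pi e^{-2\pi} \approx 3747.1,
\end{aligned}
\]

as claimed.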
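The final numerical value can be reproduced in a few lines. This sketch relies on two standard facts that are assumptions here rather than statements quoted from the text: $E_4(i) = 3\Gamma(1/4)^8/(2\pi)^6$, and $E_4'(i) = 2i\,E_4(i)$ (which follows from Ramanujan's derivative formula together with $E_2(i) = 3/\pi$ and $E_6(i) = 0$). The series for $E_4$ is used as a cross-check.

```python
import math

def sigma3(n):
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def E4_series(t, terms=40):
    # E_4(it) = 1 + 240 * sum sigma_3(n) q^n with q = e^{-2 pi t}.
    q = math.exp(-2 * math.pi * t)
    return 1 + 240 * sum(sigma3(n) * q ** n for n in range(1, terms))

E4_i = 3 * math.gamma(0.25) ** 8 / (2 * math.pi) ** 6   # assumed closed form of E_4(i)
dE4_i_squared = -4 * E4_i ** 2                            # E_4'(i)^2, assuming E_4'(i) = 2i E_4(i)

lower_bound = (-math.exp(2 * math.pi) * (-dE4_i_squared + 16 * E4_i ** 2 - 16)
               + 3840 * math.pi + 990720 * math.pi * math.exp(-2 * math.pi))

print(E4_i, E4_series(1.0))
print(lower_bound)
```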


6.7 Proof of Theorem 6.1 


Define the functions 


C(z) = A(z) + B(z), 
D(z) = A(z) — B(z). 


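Lemma 6.31. The functions $C(z)$ and $D(z)$ are holomorphic in the region $\mathrm{Re}(z) > -2$ and
have the explicit representations

\[ C(z) = -4\sin^2\Big(\frac{\pi z}{2}\Big)\int_0^\infty \big(U(it) + V(it)\big)e^{-\pi z t}\,dt \qquad (\mathrm{Re}(z) > 2), \tag{6.78} \]

\[ D(z) = -4\sin^2\Big(\frac{\pi z}{2}\Big)\int_0^\infty \big(U(it) - V(it)\big)e^{-\pi z t}\,dt \qquad (\mathrm{Re}(z) > 0). \tag{6.79} \]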


Proof. The holomorphicity is immediate from the analytic continuation of A(z) and B(z) 
discussed in Sections 6.4-6.5. Similarly, relation (6.78) is an immediate consequence of 
Lemmas 6.6 and 6.18. Relation (6.79) follows as well from these lemmas for z satisfying 
Re(z) > 2, but here we make the stronger claim that this representation remains valid in 
the larger half-plane Re(z) > 0; this is related to the fact that in the analytically contin- 
ued representations (6.24) and (6.52), the poles 4 inside the parenthesized expressions 
cancel each other out upon subtracting the two formulas. To make this more precise, 
observe that combining estimates (6.12), (6.13), (6.42), and (6.43) gives that
U(it) − V(it) = O(t) (t → ∞), (6.80)
U(it) - Vit) = O(P") (t 0), (6.81) 


and this is clearly sufficient to imply the absolute convergence of the integral in (6.79), 
uniformly on compacts in the half-plane Re(z) > 0. By the principle of analytic continu- 
ation, since the right-hand side of (6.79) is equal to D(z) for Re(z) > 2, it must also equal 
D(z) on Re(z) > 0. 


Define $\varphi : \mathbb{R}^8 \to \mathbb{R}$ by
\[ \varphi(x) = C(\|x\|^2) = \varphi_+(x) + \varphi_-(x). \]

By Lemmas 6.12 and 6.23, $\varphi(x)$ is a radial Schwartz function. By Lemmas 6.13 and 6.24 its
Fourier transform is

\[ \mathcal{F}_8[\varphi](x) = \varphi_+(x) - \varphi_-(x) = D(\|x\|^2). \]

In other words, $\varphi_+(x)$ and $\varphi_-(x)$ are the Fourier-even and Fourier-odd components in
the Fourier parity decomposition of $\varphi$; see (A.20)-(A.21).


Theorem 6.32. The function $\varphi$ is a magic function for the lattice $E_8$. Consequently,
$\Delta_{\mathrm{optimal}}(8) = \frac{\pi^4}{384}$ and the $E_8$ sphere packing is optimal.


Proof. Let $\rho_0 = \sqrt{2}$. In $\mathbb{R}^8$, we have

\[ \mathrm{vol}\big(B_{\rho_0/2}(0)\big) = \frac{\pi^4}{4!}\Big(\frac{\rho_0}{2}\Big)^8 = \frac{\pi^4}{384}, \]

which is precisely the packing density of $E_8$ (see Theorem A.8). Therefore we need to
show that $\varphi$ satisfies the three conditions of Theorem A.21 with the particular value of $\rho$
being equal to $\rho_0 = \sqrt{2}$. Indeed, by (6.26) and (6.53) we have

\[
\begin{aligned}
\varphi(0) &= \varphi_+(0) + \varphi_-(0) = 240\pi > 0, \\
\hat{\varphi}(0) &= \varphi_+(0) - \varphi_-(0) = 240\pi,
\end{aligned}
\]

so the first condition is satisfied. Next, (6.78), when combined with Lemmas 6.3 and 6.15,
implies that $\varphi(x) \le 0$ for all $x \in \mathbb{R}^8$ with $\|x\| \ge \sqrt{2}$. This confirms the third condition.
Finally, (6.79), together with inequality (6.57), implies that $\mathcal{F}_8[\varphi]$ is everywhere nonnegative.
This is the second condition of Theorem A.21 and the final one needed to be verified.
The proof that $\varphi$ is a magic function for $E_8$ and therefore that the $E_8$ sphere packing is
optimal for sphere packing in 8 dimensions is complete.
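The volume computation at the start of the proof is an instance of the general formula $\mathrm{vol}(B_r(0)) = \pi^{d/2} r^d / \Gamma(d/2 + 1)$ for a ball of radius $r$ in $\mathbb{R}^d$. A short numerical illustration (the function name is a hypothetical helper):

```python
import math

def ball_volume(d, r):
    # Volume of the Euclidean ball of radius r in R^d.
    return math.pi ** (d / 2) * r ** d / math.gamma(d / 2 + 1)

rho0 = math.sqrt(2)
bound = ball_volume(8, rho0 / 2)   # Cohn-Elkies bound for d = 8 with rho = sqrt(2)

print(bound, math.pi ** 4 / 384)   # both are approximately 0.2537
```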


Suggested exercises for Section 6.7. 6.5, 6.6, 6.7.

Exercises for Chapter 6


6.1 Prove Lemma 6.10.

6.2 Prove Lemma 6.12.

6.3 Explain why the operation of commuting the Fourier transform with the integrals
in (6.32) is justified.

6.4 Show that the function $A(z)$, which was analytically continued to a holomorphic
function on the half-plane $\mathrm{Re}(z) > -2$ in Lemma 6.8, can in fact be continued
analytically to an entire function.

6.5 Find the special values $\varphi(\sqrt{2})$, $\varphi'(\sqrt{2})$, $\hat{\varphi}(\sqrt{2})$, $(\hat{\varphi})'(\sqrt{2})$ associated with the radial
profile of the $E_8$ magic function $\varphi$, the radial profile of the Fourier transform of
$\varphi$, and their derivatives.

6.6 Prove that the magic function $\varphi(x)$ satisfies the following properties:

(a) fes (x) dx = 2407. 

b) Èran) — 1) = 0. 

(c) Èxer, P(X +y) = 2407 for all y € R. 

6.7 Magic function for the Leech lattice. [16] Prove that there exists a magic function
for the Leech lattice in dimension 24.

Guidance. Repeat the proof of this chapter with appropriate modifications. The 
function U (T) should be replaced by 


Ly (t)t* + L4 (T)T + Lo (T) 
(E3 - £2)? 


Uy4(T) = 6912 x 


with Up, Ly, H defined by 


U(7) = 36(25E¢ — 49E3), 

(T) = 6rti(48EgEs + 2(25Es — 49E?)E,), 

Hy(T) = n° (25E} — A9EZE, + 48EgEzE, + (25E¢ — 49E:)E%). 
In place of V (T), use 


70268 + 76765 + 2028 
(E3 - E2) 


Vo4(T) = 12° x 


See Exercise A.16 in the Appendix for the relevant properties of the Leech lattice. 


A Appendix: Background on sphere packings 


This appendix presents the background material on sphere packings and related notions 
that is necessary to understand the developments of Chapter 6. The material discussed 
here mostly does not involve any complex analysis (with the one notable exception be- 
ing the proof of Proposition A.17 in Section A.7). Before reading this appendix, we rec- 
ommend reading Sections 6.1 and 6.2 for motivation. 


A.1 Sphere packings and their densities 


Fix a dimension $d \ge 2$. Given $r > 0$ and $x \in \mathbb{R}^d$, denote by $B_r(x)$ the Euclidean ball of
radius $r$ centered at $x$. A sphere packing in $\mathbb{R}^d$ consists of a union of balls of equal radii
with nonoverlapping interiors. We commonly denote a packing as

\[ P = P(X, r) = \bigcup_{x \in X} B_r(x), \]

where $X \subset \mathbb{R}^d$ is the set of centers of the balls participating in the union, and $r$ is their
common radius. The upper packing density associated with a sphere packing $P$ is

\[ \Delta_P^* = \limsup_{R \to \infty} \frac{\mathrm{vol}(P \cap B_R(0))}{\mathrm{vol}(B_R(0))}. \tag{A.1} \]

In the case where the limsup in (A.1) is in fact an ordinary limit, we say that $P$ has a
packing density. In that case, we denote $\Delta_P^*$ by $\Delta_P$ and refer to this quantity simply as the
packing density of $P$.

The optimal packing density of $\mathbb{R}^d$ is defined to be

\[ \Delta_{\mathrm{optimal}}(d) = \sup\{\Delta_P : P \text{ is a sphere packing in } \mathbb{R}^d\}. \]

A sphere packing $P$ in $\mathbb{R}^d$ is called optimal if it has a packing density and its packing
density is equal to $\Delta_{\mathrm{optimal}}(d)$.


Theorem A.1 ([35], [36, Sec. 3.viii]). An optimal sphere packing in $\mathbb{R}^d$ exists.


Sphere packings have a trivial scale invariance property: replacing all the balls
$B_r(x)$ in a sphere packing $P$ by their scaled copies $B_{\lambda r}(\lambda x)$ for some constant $\lambda > 0$
results in a sphere packing with the same packing density. For this reason, when proving
facts about packing densities for general sphere packings, we can assume without loss
of generality that a packing has some specific common sphere radius $r$ (where $r$ can be
chosen arbitrarily for some reason of convenience).


Suggested exercises for Section A.1. A.1.

A.2 Lattices and lattice packings


A lattice in $\mathbb{R}^d$ is a set of points of the form

\[ \Lambda = \Big\{ \sum_{j=1}^d n_j x_j : n_1, \ldots, n_d \in \mathbb{Z} \Big\}, \]

where $x_1, \ldots, x_d$ is a linear basis for $\mathbb{R}^d$. Another notation for the same set is
$\bigoplus_{j=1}^d \mathbb{Z} \cdot x_j$. (This may be referred to as the $\mathbb{Z}$-span of the vectors $x_1, \ldots, x_d$.
The spanning set $x_1, \ldots, x_d$ is said to be a basis for the lattice $\Lambda$; note that it is not unique.) Given a lattice,
it is easy to check that the associated union of balls P(A, r) is asphere packing if and only 
ifr < r,(A), where 


r, (A) := min] 


n 
2 nx 
j=1 


: (hs... N4) ezi (0.0) 


We refer to the sphere packing P(A, r, (A)) as the lattice sphere packing (or lattice pack- 
ing) associated with the lattice A and denote its packing density by 6). 

It is not known whether in every dimension $d$ there exists a lattice $\Lambda$ whose associated
sphere packing is optimal. This is the case in the dimensions $d = 2, 3, 8, 24$, which are
the only dimensions for which the value of $\Delta_{\mathrm{optimal}}(d)$ has been established rigorously.


A.3 Periodic sphere packings 


Lattice sphere packings are a particular case of a more general family of sphere packings
called periodic sphere packings. These are packings that have a periodic structure
associated with a lattice. More precisely, let $\Lambda$ be a lattice in $\mathbb{R}^d$, let $A = \{x_1, \ldots, x_m\} \subset \mathbb{R}^d$
be a finite set of points, and let $r > 0$ be a number. Assume that $\|x + x_j - x_k\| \ge 2r$ for all
$1 \le j, k \le m$ and all $x \in \Lambda$, except for the case $x = 0$ and $j = k$. Then the union of balls of
radius $r$ centered around $\Lambda$-translates of the points of $A$ is a sphere packing; that is, we
define

\[ P = P(X, r), \quad \text{where } X = \Lambda + A = \{x_j + y : 1 \le j \le m,\ y \in \Lambda\}. \tag{A.2} \]


A sphere packing constructed in such a way is called a periodic sphere packing (or 
periodic packing). 

It is not known whether in every dimension d there exists a periodic sphere pack- 
ing that is optimal. However, periodic packings are sufficiently general that they come 
arbitrarily close to being optimal, as the following result makes precise. 


Lemma A.2 ([14, Appendix A]).
\[ \Delta_{\mathrm{optimal}}(d) = \sup\{\Delta_P : P \text{ is a periodic sphere packing in } \mathbb{R}^d\}. \]

A.4 Lattice covolume


The covolume of a lattice $\Lambda = \bigoplus_{j=1}^d \mathbb{Z} \cdot x_j$, denoted $\mathrm{covol}(\Lambda)$, is defined as the absolute
value of the determinant of the matrix containing the vectors $x_1, \ldots, x_d$ as its rows.


Lemma A.3. The definition of $\mathrm{covol}(\Lambda)$ is independent of the choice of basis $x_1, \ldots, x_d$ for
the lattice. Moreover, the covolume has the following geometric interpretation: it is the
volume of the set

\[ \{t_1x_1 + t_2x_2 + \cdots + t_dx_d : t_1, \ldots, t_d \in [0, 1]\} \]

(called the fundamental cell, or fundamental parallelepiped, of the lattice associated
with the basis $x_1, \ldots, x_d$).


Proof. Exercise A.2. 


Lemma A.4. 1. Fora lattice A c RÊ, the packing density of the associated lattice sphere 
packing is given by 


_ vol(Bay(0)) nr, (ay! (A3) 
AT covoa(A ~ r(4 +1) covol(A)’ ) 
2. Fora periodic sphere packing P as in (A.2), its packing density is 
d/2,„d 
= mvol(B,(0)) _ mor (A.4) 


covol(A) — T(4 + 1) covol(A) ` 


(In (A.3)-(A.4), T denotes the Euler gamma function.) 


Proof. The second equality in each of relations (A.3) and (A.4) follows from the well-known
formula for the volume of the unit ball in $\mathbb{R}^d$; see Exercise 2.3 on page 110. The
proof of the additional claim relating the explicit quantities in (A.3) and (A.4) to the packing
densities $\delta_\Lambda$ and $\Delta_P$ is left as an exercise (Exercise A.3).


Suggested exercises for Section A.4. A.2, A.3. 


A.5 Dual lattices 


If $\Lambda$ is a lattice in $\mathbb{R}^d$, then its dual lattice is the set denoted $\Lambda^*$ and defined by
\[ \Lambda^* = \{y \in \mathbb{R}^d : \langle x, y\rangle \in \mathbb{Z} \text{ for all } x \in \Lambda\}. \]

The fact that $\Lambda^*$ is a lattice follows from Lemma A.5.

Lemma A.5. If $B = \{x_1, \ldots, x_d\}$ is a basis for the lattice $\Lambda$, let $B^* = \{y_1, \ldots, y_d\}$ be the dual
basis, when considering $B$ as a linear basis for $\mathbb{R}^d$; that is, the vectors $y_1, \ldots, y_d$ are the
unique vectors satisfying

\[ \langle x_j, y_k\rangle = \delta_{jk} \qquad (1 \le j, k \le d) \]

(where $\delta_{jk}$ denotes the Kronecker delta). Then we have $\Lambda^* = \bigoplus_{j=1}^d \mathbb{Z} \cdot y_j$.


Proof. Exercise A.4. 


Suggested exercises for Section A.5. A.4, A.5. 


A.6 The Poisson summation formula for lattices 


In Chapter 2, we discussed the Poisson summation formula for functions of a single real
variable (Theorem 2.6), a classical result from Fourier analysis, in the context of our
proof of the functional equation of the Riemann zeta function. There is a version of the
same result for functions on $\mathbb{R}^d$ involving summation over lattices. This result relates the
summation of values of a nicely behaved function on $\mathbb{R}^d$ over a lattice to the summation
of its Fourier transform over the dual lattice and plays an important role in the study of
sphere packings.

To state the result, first recall some basic facts about Fourier transforms in $d$ dimensions.
The Fourier transform in $\mathbb{R}^d$ is the operator $\mathcal{F}_d$ taking a function $f : \mathbb{R}^d \to \mathbb{C}$ to
the function $\mathcal{F}_d[f]$ given by

\[ \mathcal{F}_d[f](y) = \int_{\mathbb{R}^d} f(x)\exp(-2\pi i\langle y, x\rangle)\,dx, \tag{A.5} \]

assuming appropriate integrability conditions. We also denote the Fourier transform of
$f$ by $\hat{f}$. The Fourier transform acts in a particularly nice way on Schwartz functions.
A function $f : \mathbb{R}^d \to \mathbb{R}$ is called a Schwartz function if it satisfies

\[ \sup_{x = (x_1, \ldots, x_d) \in \mathbb{R}^d} \left| x_1^{j_1} \cdots x_d^{j_d}\, \frac{\partial^{k_1 + \cdots + k_d} f}{\partial x_1^{k_1} \cdots \partial x_d^{k_d}}(x) \right| < \infty \]

for any integers $j_1, \ldots, j_d, k_1, \ldots, k_d \ge 0$. The following is a standard fact from analysis;
see [41, p. 301] for the proof.
see [41, p. 301] for the proof. 


Proposition A.6. The Fourier transform of a Schwartz function is also a Schwartz func- 
tion. 


We can now state the Poisson summation formula for lattices.

Theorem A.7 (Poisson summation formula for lattices). Let $\Lambda \subset \mathbb{R}^d$ be a lattice, and let
$f : \mathbb{R}^d \to \mathbb{C}$ be a Schwartz function. Then

\[ \sum_{x \in \Lambda} f(x) = \frac{1}{\mathrm{covol}(\Lambda)} \sum_{y \in \Lambda^*} \hat{f}(y). \tag{A.6} \]

Another, slightly more general, version of the Poisson summation formula for lattices is

\[ \sum_{x \in \Lambda} f(x + t) = \frac{1}{\mathrm{covol}(\Lambda)} \sum_{y \in \Lambda^*} \hat{f}(y)\exp(2\pi i\langle y, t\rangle) \qquad (t \in \mathbb{R}^d). \tag{A.7} \]

In fact, equations (A.6) and (A.7) are equivalent, since (A.6) is the case $t = 0$ of (A.7),
and conversely, the general case of (A.7) for arbitrary $t \in \mathbb{R}^d$ is immediately obtained
from (A.6) on applying that relation to the function $g(x) = f(x + t)$.
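A quick numerical illustration of (A.7) in the simplest setting, offered only as a sketch: take $d = 1$, the lattice $\Lambda = a\mathbb{Z}$ (so $\mathrm{covol}(\Lambda) = a$ and $\Lambda^* = a^{-1}\mathbb{Z}$), and the Gaussian $f(x) = e^{-\pi x^2}$, which is its own Fourier transform under the convention (A.5). The values of $a$ and $t$ and the function names are arbitrary choices made for the example.

```python
import cmath
import math

def f(x):
    # Gaussian that equals its own Fourier transform under convention (A.5).
    return math.exp(-math.pi * x * x)

f_hat = f

def lattice_side(a, t, terms=60):
    # Left-hand side of (A.7) for the lattice a*Z, truncated symmetrically.
    return sum(f(a * n + t) for n in range(-terms, terms + 1))

def dual_side(a, t, terms=60):
    # Right-hand side of (A.7): sum over the dual lattice (1/a)*Z, divided by covol = a.
    total = sum(f_hat(k / a) * cmath.exp(2j * math.pi * (k / a) * t)
                for k in range(-terms, terms + 1))
    return total / a

a, t = 1.7, 0.3
print(lattice_side(a, t), dual_side(a, t))
```

The imaginary part of the dual-side sum vanishes up to rounding, as it must since the left-hand side is real.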


Proof of Theorem A.7. Exercise A.6. 


Suggested exercises for Section A.6. A.6. 


A.7 Construction of the lattice $E_8$


The goal of this section is to construct the lattice $E_8$, which plays a central role in the
sphere packing story. We will prove the following result.


Theorem A.8. There exists a lattice in $\mathbb{R}^8$, denoted $E_8$, with the following properties:

1. The packing density $\delta_{E_8}$ of the sphere packing associated with $E_8$ is $\frac{\pi^4}{384}$.

2. The set of Euclidean norms of points of the lattice $E_8$ is

\[ \{\sqrt{2n} : n = 0, 1, 2, \ldots\}. \]

An immediate corollary of the existence of $E_8$ is the following conceptually important
result.


Corollary A.9. The optimal sphere packing density $\Delta_{\mathrm{optimal}}(8)$ in 8 dimensions satisfies

\[ \Delta_{\mathrm{optimal}}(8) \ge \frac{\pi^4}{384}. \]


Several different constructions of $E_8$ are known; perhaps its most natural manifestation
is as the lattice spanned by the $E_8$ root system, an object associated with the $E_8$
Lie algebra, one of the so-called exceptional Lie algebras that appears in a famous
classification theorem [40, p. 238]. Here we give an elementary construction of $E_8$, which
provides a straightforward path to a proof of our claims (while offering little insight
into what makes $E_8$ so special and interesting). Define the vectors $x_1, \ldots, x_8 \in \mathbb{R}^8$ as the
columns of the matrix

\[
M = \begin{pmatrix}
2 & -1 & 0 & 0 & 0 & 0 & 0 & \tfrac12 \\
0 & 1 & -1 & 0 & 0 & 0 & 0 & \tfrac12 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & \tfrac12 \\
0 & 0 & 0 & 1 & -1 & 0 & 0 & \tfrac12 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 & \tfrac12 \\
0 & 0 & 0 & 0 & 0 & 1 & -1 & \tfrac12 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & \tfrac12 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & \tfrac12
\end{pmatrix}
\]

and define

\[ E_8 = \bigoplus_{j=1}^8 \mathbb{Z} \cdot x_j. \]


Lemma A.10. $E_8$ is a lattice with basis $x_1, \ldots, x_8$, and its covolume is 1.


Proof. The $x_j$ are clearly linearly independent, so $E_8$ is indeed a lattice in $\mathbb{R}^8$, and
$\mathrm{covol}(E_8) = \det(M) = 1$.


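Lemma A.11. The lattice $E_8$ has an alternative representation as

\[
E_8 = \Big\{ y \in \mathbb{Z}^8 : \sum_{j=1}^8 y_j \equiv 0 \ (\mathrm{mod}\ 2) \Big\}
\cup \Big\{ y \in \big(\mathbb{Z} + \tfrac12\big)^8 : \sum_{j=1}^8 y_j \equiv 0 \ (\mathrm{mod}\ 2) \Big\}. \tag{A.8}
\]

Proof. Denote the two sets participating in the union on the right-hand side of (A.8) by
$I_8$ and $J_8$, respectively. By inspection, $x_1, \ldots, x_7 \in I_8$, $x_8 \in J_8$, and $I_8 \cup J_8$ is closed under
the taking of linear combinations with integer coefficients. This shows that $E_8 \subseteq I_8 \cup J_8$.
Conversely, if $y = (y_1, \ldots, y_8) \in I_8$, then we can write

\[ y = \sum_{j=1}^8 a_j x_j \]

(regarding $y$ for convenience as a column vector), where

\[
\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \\ a_6 \\ a_7 \\ a_8 \end{pmatrix}
= M^{-1} y =
\begin{pmatrix}
\tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & \tfrac12 & -\tfrac72 \\
0 & 1 & 1 & 1 & 1 & 1 & 1 & -6 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & -5 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & -4 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & -3 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & -2 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 2
\end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \\ y_8 \end{pmatrix}.
\]

Again by inspection, the assumption that the $y_j$ are integers satisfying $\sum_{j=1}^8 y_j \equiv 0 \ (\mathrm{mod}\ 2)$
immediately implies that $a_1, \ldots, a_8$ obtained in this way are themselves integers and that
therefore $y = \sum_{j=1}^8 a_j x_j \in E_8$. This shows that $I_8 \subseteq E_8$. To show that also $J_8 \subseteq E_8$, observe
that if $y \in J_8$, then $y - x_8 \in I_8$, so the previous calculation shows that $y = x_8 + \sum_{j=1}^8 a_j x_j$,
where the $a_j$ are integer coefficients, and thus once again we have that $y \in E_8$.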


Lemma A.12. For any $x, y \in E_8$, we have $\langle x, y\rangle \in \mathbb{Z}$.


Proof. For $1 \le j, k \le 8$, define $t_{j,k} = \langle x_j, x_k\rangle$; explicitly, the numbers $t_{j,k}$ are the
entries of the symmetric matrix

\[
M^{T}M = \begin{pmatrix}
4 & -2 & 0 & 0 & 0 & 0 & 0 & 1 \\
-2 & 2 & -1 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 2 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 2 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & -1 & 2 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 2 & -1 & 0 \\
0 & 0 & 0 & 0 & 0 & -1 & 2 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 2
\end{pmatrix}.
\]

Now if $x, y \in E_8$, then express $x, y$ as $x = \sum_{j=1}^8 a_j x_j$ and $y = \sum_{k=1}^8 b_k x_k$, with integer
coefficients $a_j, b_k$. Then

\[ \langle x, y\rangle = \sum_{j,k=1}^8 t_{j,k}\, a_j b_k, \tag{A.9} \]

which is manifestly an integer.

Lemma A.13. For any $x \in E_8$, we have $\|x\|^2 \in 2\mathbb{Z}$.


Proof. This is immediate from (A.9) on setting $y = x$ and noting that the double sum can
be rewritten as

\[ \sum_{j,k=1}^8 t_{j,k}\, a_j a_k = \sum_{j=1}^8 t_{j,j}\, a_j^2 + 2 \sum_{1 \le j < k \le 8} t_{j,k}\, a_j a_k, \]

which is easily recognized as an even integer, since the diagonal entries $t_{j,j}$ are all even.

Lemma A.14. The packing density of the sphere packing associated with $E_8$ is $\frac{\pi^4}{384}$.


Proof. Since the squared norms $\|x\|^2$ for $x \in E_8$ are nonnegative even integers, the
minimal norm of a nonzero vector is at least $\sqrt{2}$. On the other hand, the vector $x =
(1, 1, 0, 0, 0, 0, 0, 0) = x_1 + x_2$ is one specific vector in $E_8$ with that norm, so $\sqrt{2}$ is in fact
precisely the minimal nonzero norm. This establishes that


r, (Eg) = a 


Now using (A.3) together with the already established fact that covol(E,) = 1 gives the 
claim. 


Lemma A.15. $E_8^* = E_8$.


Proof. Lemma A.12 can be reformulated as the statement that $E_8 \subseteq E_8^*$. To prove the
reverse inclusion, let $y_1, \ldots, y_8 \in \mathbb{R}^8$ denote the elements of the dual basis to $x_1, \ldots, x_8$.
These are simply the rows of $M^{-1}$ (or if they are thought of as column vectors, then the
columns of $(M^{-1})^T$). Now observe the somewhat trivial matrix equation

\[
(M^{-1})^T = M\big(M^{-1}(M^{-1})^T\big)
= M \begin{pmatrix}
14 & 24 & 20 & 16 & 12 & 8 & 4 & -7 \\
24 & 42 & 35 & 28 & 21 & 14 & 7 & -12 \\
20 & 35 & 30 & 24 & 18 & 12 & 6 & -10 \\
16 & 28 & 24 & 20 & 15 & 10 & 5 & -8 \\
12 & 21 & 18 & 15 & 12 & 8 & 4 & -6 \\
8 & 14 & 12 & 10 & 8 & 6 & 3 & -4 \\
4 & 7 & 6 & 5 & 4 & 3 & 2 & -2 \\
-7 & -12 & -10 & -8 & -6 & -4 & -2 & 4
\end{pmatrix}.
\]

For each $1 \le j \le 8$, the $j$th column $y_j$ of the matrix $(M^{-1})^T$ can be expressed as a linear
combination of $x_1, \ldots, x_8$ with coefficients taken from the $j$th column of the matrix
$M^{-1}(M^{-1})^T$ (e.g., $y_1 = 14x_1 + 24x_2 + 20x_3 + 16x_4 + 12x_5 + 8x_6 + 4x_7 - 7x_8$). These coefficients
are all integers, and thus $y_j \in E_8$. Since $y_1, \ldots, y_8$ are a basis for $E_8^*$ (see Lemma A.5), we
have shown that $E_8^* \subseteq E_8$. This completes the proof that $E_8^* = E_8$.
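The matrix computations behind Lemmas A.10, A.12, A.13, and A.15 can be verified with exact rational arithmetic. The sketch below assumes the basis $x_1 = (2, 0, \ldots, 0)$, $x_j = -e_{j-1} + e_j$ for $2 \le j \le 7$, and $x_8 = (\tfrac12, \ldots, \tfrac12)$, stored as the columns of $M$; the helper functions are hypothetical and use plain Gauss-Jordan elimination.

```python
from fractions import Fraction as F

# Build M whose columns are the assumed basis vectors x_1, ..., x_8.
M = [[F(0)] * 8 for _ in range(8)]
M[0][0] = F(2)
for j in range(1, 7):
    M[j][j] = F(1)
    M[j - 1][j] = F(-1)
for i in range(8):
    M[i][7] = F(1, 2)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

def inverse_and_det(A):
    # Gauss-Jordan elimination over the rationals; returns (A^{-1}, det A).
    n = len(A)
    aug = [list(row) + [F(int(i == j)) for j in range(n)] for i, row in enumerate(A)]
    det = F(1)
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        if p != c:
            aug[c], aug[p] = aug[p], aug[c]
            det = -det
        pivot = aug[c][c]
        det *= pivot
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug], det

gram = matmul(transpose(M), M)              # entries <x_j, x_k>
M_inv, det_M = inverse_and_det(M)
dual_gram = matmul(M_inv, transpose(M_inv))  # coefficients expressing dual basis in x_1..x_8

print(det_M)
print([int(v) for v in gram[0]])
print([int(v) for v in dual_gram[0]])
```

Integrality of `gram` with even diagonal reflects Lemmas A.12 and A.13, and integrality of `dual_gram` is the key step in the proof of Lemma A.15.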


Our last remaining task for this section is to prove the second claim in Theorem A.8.
We already showed that all the squared norms of $E_8$ lattice vectors are even integers; it
remains to show that all positive even integers are in fact squared norms of $E_8$ vectors.
This will follow from a much more precise statement. Define the numbers $(a_n)_{n=0}^\infty$ by

\[ a_n = \#\{x \in E_8 : \|x\|^2 = 2n\}. \]

Note that, trivially, $a_0 = 1$.

Lemma A.16. For some constant $C > 0$, we have the bound $a_n \le Cn^4$ for all $n \ge 1$.


Proof. Exercise A.7. 


Proposition A.17. We have the relation
\[ a_n = 240\,\sigma_3(n) \]

(where $\sigma_3(n)$ is defined in (5.2)) for all $n \ge 1$.


Remarkably, this result, which has a distinct number-theoretic flavor, can be proved
using a complex-analytic argument involving modular forms. The idea is to form a kind
of generating function for the squared norms of $E_8$ lattice vectors (known in the theory
of lattices as the theta series of the lattice) and study its complex-analytic properties.
More precisely, define a function of a complex variable $\tau$ by

\[ \eta(\tau) = \sum_{x \in E_8} e^{\pi i \tau \|x\|^2} = \sum_{n=0}^\infty a_n e^{2\pi i n \tau}. \tag{A.10} \]


Lemma A.18. The infinite series (A.10) converges absolutely and uniformly on compacts
on the upper half-plane $\mathbb{H}$ and defines a holomorphic function there.


Proof. By Lemma A.16,

\[ \sum_{x \in E_8} \big| e^{\pi i \tau \|x\|^2} \big| \le 1 + \sum_{n=1}^\infty Cn^4 e^{-2\pi n\,\mathrm{Im}(\tau)}, \]

which converges uniformly in any half-plane of the form $\{\tau : \mathrm{Im}(\tau) > \kappa\}$ where $\kappa > 0$
and a fortiori on any compact subset of $\mathbb{H}$. The holomorphy follows from the standard
theory (Theorem 1.39).


Lemma A.19. The function $\eta(\tau)$ is a modular form of weight 4.


Proof. The equation $\eta(\tau + 1) = \eta(\tau)$ is immediate from (A.10), i.e., $\eta(\tau)$ transforms
correctly under the generator $T$ of the modular group $\Gamma$. We need to show that $\eta(\tau)$ also
transforms in the correct way under the generator $S$, that is, that $\eta(\tau)$ satisfies the
equation

\[ \eta(-1/\tau) = \tau^4 \eta(\tau). \tag{A.11} \]

By Lemma 5.21 that would imply that $\eta(\tau)$ is a modular form of weight 4.
To prove (A.11), define the function $f_\tau : \mathbb{R}^8 \to \mathbb{C}$ depending on a parameter $\tau \in \mathbb{H}$ by

\[ f_\tau(x) = e^{\pi i \tau \|x\|^2}. \tag{A.12} \]

In the case where $\tau$ lies on the positive imaginary axis, i.e., $\tau = it$ with $t > 0$, this is
an 8-dimensional scaled Gaussian $e^{-\pi t \|x\|^2}$, which transforms under the (8-dimensional)
Fourier transform as

\[ \widehat{f_\tau}(y) = \tau^{-4} f_{-1/\tau}(y) = \tau^{-4} e^{-\pi i \|y\|^2/\tau}. \tag{A.13} \]

(For general $\tau \in \mathbb{H}$, this equation still holds, but if you are feeling queasy about this or
cannot be bothered to check it, just assume that $\tau$ is on the positive imaginary axis for
now.) Applying the Poisson summation formula (A.6) and keeping in mind Lemma A.15
(together with $\mathrm{covol}(E_8) = 1$) give

\[ \sum_{x \in E_8} f_\tau(x) = \sum_{y \in E_8} \widehat{f_\tau}(y). \tag{A.14} \]

This is precisely what we need, since the left-hand side of (A.14) is equal to $\eta(\tau)$, and,
by (A.13), the right-hand side is equal to $\tau^{-4}\eta(-1/\tau)$. Thus we have established (A.11).
(As a final step, if you previously assumed that $\tau$ is imaginary, then now appeal to the
principle of analytic continuation to argue that since the equation (A.11) holds on the
positive imaginary axis, it must hold on all of $\mathbb{H}$.)
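The transformation law (A.11) can be illustrated numerically on the imaginary axis. The sketch below assumes the standard identity $\eta(\tau) = \tfrac12\big(\theta_2(\tau)^8 + \theta_3(\tau)^8 + \theta_4(\tau)^8\big)$ for the theta series of $E_8$ (it can be read off from the description of $E_8$ in Lemma A.11), with thetanull series in $Q = e^{\pi i \tau}$; it also compares with the series $1 + 240\sum \sigma_3(n)q^n$. The function names are hypothetical.

```python
import math

def sigma3(n):
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def theta_nulls(t, terms=12):
    # Thetanull values at tau = i*t, from their series in Q = e^{-pi t}.
    Q = math.exp(-math.pi * t)
    th2 = 2 * sum(Q ** ((n + 0.5) ** 2) for n in range(terms))
    th3 = 1 + 2 * sum(Q ** (n * n) for n in range(1, terms))
    th4 = 1 + 2 * sum((-1) ** n * Q ** (n * n) for n in range(1, terms))
    return th2, th3, th4

def eta(t):
    # Theta series of E_8 at tau = i*t, under the assumed thetanull identity.
    th2, th3, th4 = theta_nulls(t)
    return 0.5 * (th2 ** 8 + th3 ** 8 + th4 ** 8)

def E4(t, terms=60):
    q = math.exp(-2 * math.pi * t)
    return 1 + 240 * sum(sigma3(n) * q ** n for n in range(1, terms))

t = 2.0
# With tau = i*t we have -1/tau = i/t and tau^4 = t^4.
print(eta(1 / t), t ** 4 * eta(t))
print(eta(1.0), E4(1.0))
```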


Lemma A.20. We have the identity
\[ \eta(\tau) = E_4(\tau) \qquad (\tau \in \mathbb{H}), \tag{A.15} \]

where $E_4$ denotes the normalized version of the Eisenstein series $G_4$ defined in (5.87).


Proof. By Theorem 5.24 the vector space $M_4$ of modular forms of weight 4 is one-dimensional
and contains $\eta(\tau)$ and $E_4(\tau)$. Thus we have

\[ 1 + \sum_{n=1}^\infty a_n e^{2\pi i n \tau} = K E_4(\tau) = K \cdot \Big(1 + 240 \sum_{n=1}^\infty \sigma_3(n) e^{2\pi i n \tau}\Big) \]

for some constant $K$. Equating the 0th Fourier coefficients on both sides gives $K = 1$, proving the claim.


Proof of Proposition A.17. This follows immediately from (A.15), again by comparing the 
Fourier coefficients on both sides. 
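For small $n$, the relation $a_n = 240\,\sigma_3(n)$ can be confirmed by brute-force enumeration. The sketch below uses the description of $E_8$ from Lemma A.11 in doubled coordinates $u = 2y$: the entries of $u$ are either all even or all odd, their sum is divisible by 4, and $\|y\|^2 = 2n$ becomes $\sum u_j^2 = 8n$. The function names are hypothetical.

```python
from math import isqrt

def sigma3(n):
    return sum(d ** 3 for d in range(1, n + 1) if n % d == 0)

def count_vectors(n):
    # Number of x in E_8 with |x|^2 = 2n, counted via doubled coordinates u = 2x.
    target = 8 * n
    bound = isqrt(target)
    total = 0
    for parity in (0, 1):  # 0: integer vectors, 1: half-integer vectors
        values = [v for v in range(-bound, bound + 1) if v % 2 == parity]

        def rec(k, remaining, coord_sum):
            if k == 8:
                return 1 if remaining == 0 and coord_sum % 4 == 0 else 0
            return sum(rec(k + 1, remaining - v * v, coord_sum + v)
                       for v in values if v * v <= remaining)

        total += rec(0, target, 0)
    return total

for n in range(1, 4):
    print(n, count_vectors(n), 240 * sigma3(n))
```

The recursion prunes any partial vector whose squared length already exceeds the target, so the enumeration stays small for the first few values of $n$.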


Suggested exercises for Section A.7. A.7, A.8. 


A.8 The Cohn-Elkies sphere packing bounds 


Theorem A.21 (Cohn-Elkies sphere packing bounds [14]). Let $f : \mathbb{R}^d \to \mathbb{R}$ be a Schwartz
function, and let $\rho > 0$ be a number. Assume that the following conditions are satisfied:
1. $f(0) = \hat{f}(0) > 0$;

2. The Fourier transform $\hat{f}$ is real-valued, and $\hat{f}(y) \ge 0$ for all $y \in \mathbb{R}^d$;

3. $f(x) \le 0$ for all $x \in \mathbb{R}^d$ such that $\|x\| \ge \rho$.

Then $\Delta_{\mathrm{optimal}}(d)$, the optimal packing density in $\mathbb{R}^d$, satisfies

\[ \Delta_{\mathrm{optimal}}(d) \le \mathrm{vol}\big(B_{\rho/2}(0)\big) = \frac{\pi^{d/2}\rho^d}{2^d\,\Gamma\big(\frac{d}{2} + 1\big)}. \tag{A.16} \]


Proof. By Lemma A.2 it suffices to prove that $\mathrm{vol}(B_{\rho/2}(0))$ is an upper bound for the
packing density of any periodic sphere packing with common sphere radius $\rho/2$ (see the
remark about scale invariance in Section A.1). Let $P$ be such a packing, defined in terms
of a lattice $\Lambda$ and a finite set $\{x_1, \ldots, x_m\}$ as in (A.2). Recall that the fact that the common
radius of the spheres in the packing is $\rho/2$ means that the Euclidean norm $\|x + x_j - x_k\|$
for any $1 \le j, k \le m$ and lattice point $x \in \Lambda$ is either 0 (in the case $x = 0$ and $j = k$) or is
otherwise bounded from below by $\rho$.

Let $1 \le j, k \le m$. Applying the Poisson summation formula (A.7) with $t = x_j - x_k$ gives

\[ \sum_{x \in \Lambda} f(x + x_j - x_k) = \frac{1}{\mathrm{covol}(\Lambda)} \sum_{y \in \Lambda^*} \hat{f}(y)\exp\big(2\pi i\langle y, x_j - x_k\rangle\big). \tag{A.17} \]

Summing this relation over all $j, k$ further gives that

\[
\begin{aligned}
\sum_{j,k=1}^m \sum_{x \in \Lambda} f(x + x_j - x_k)
&= \frac{1}{\mathrm{covol}(\Lambda)} \sum_{j,k=1}^m \sum_{y \in \Lambda^*} \hat{f}(y)\exp\big(2\pi i\langle y, x_j - x_k\rangle\big) \\
&= \frac{1}{\mathrm{covol}(\Lambda)} \sum_{y \in \Lambda^*} \hat{f}(y) \sum_{j,k=1}^m \exp\big(2\pi i\langle y, x_j\rangle\big)\,\overline{\exp\big(2\pi i\langle y, x_k\rangle\big)} \\
&= \frac{1}{\mathrm{covol}(\Lambda)} \sum_{y \in \Lambda^*} \hat{f}(y) \Big| \sum_{j=1}^m \exp\big(2\pi i\langle y, x_j\rangle\big) \Big|^2.
\end{aligned}
\tag{A.18}
\]

The first and last expressions in this chain of relations are manifestly real numbers,
and we will reach our desired conclusion by upper-bounding the former and lower-bounding
the latter. Specifically, we have that

\[
f(x + x_j - x_k) \text{ is }
\begin{cases}
= f(0) & \text{if } x = 0 \text{ and } j = k, \\
\le 0 & \text{otherwise,}
\end{cases}
\]

by our observation above about $\|x + x_j - x_k\|$ combined with the third condition in the
theorem about $f$. Thus the leftmost expression in (A.18) is bounded from above by $m f(0)$.
On the other hand, by the second condition $f$ satisfies, the rightmost expression in (A.18)
can only be made smaller by discarding all terms $y \in \Lambda^* \setminus \{0\}$. Thus the expression is
bounded from below by $\frac{m^2}{\mathrm{covol}(\Lambda)}\hat{f}(0) = \frac{m^2}{\mathrm{covol}(\Lambda)} f(0)$. Combining these two bounds
(and recalling that $f(0) > 0$) yields the inequality

\[ \mathrm{covol}(\Lambda) \ge m. \]

This is exactly what we need, since the packing density then satisfies

\[ \Delta_P = \frac{m\,\mathrm{vol}\big(B_{\rho/2}(0)\big)}{\mathrm{covol}(\Lambda)} \le \mathrm{vol}\big(B_{\rho/2}(0)\big), \]

as the inequality in (A.16) claims. (The second, more explicit expression in (A.16) for the
upper bound follows from the well-known formula for the volume of the unit ball in
$\mathbb{R}^d$; see Exercise 2.3 on page 110.)


A.9 Magic functions 


Given a lattice $\Lambda \subset \mathbb{R}^d$ with packing density $\delta_\Lambda$, a Schwartz function $f : \mathbb{R}^d \to \mathbb{R}$ is called
a magic function for $\Lambda$ if it satisfies the assumptions of Theorem A.21 with the particular
value of $\rho$ for which

\[ \mathrm{vol}\big(B_{\rho/2}(0)\big) = \delta_\Lambda. \]

By Theorem A.21, if we were to prove the existence of a magic function for some specific
lattice $\Lambda$, that would imply that $\Delta_{\mathrm{optimal}}(d) = \delta_\Lambda$, and that the lattice packing associated
with $\Lambda$ is optimal for sphere packing in $\mathbb{R}^d$, thereby resolving the sphere packing problem
in dimension $d$.

Magic functions are a tool that seems almost too powerful (or “magic,” hence the 
name) to exist. Indeed, heavy numerical experimentation done by Cohn and Elkies 
suggested that in most low dimensions they do not; but in a few special dimensions, 
the numerical evidence suggested that they do exist, leading to the following conjec- 
ture. 


Conjecture A.22 (Cohn-Elkies [14]). Magic functions exist for the following dimensions
and lattices:

1. $d = 2$: the hexagonal lattice $(\mathbb{Z} \cdot (1, 0)) \oplus \big(\mathbb{Z} \cdot (\tfrac12, \tfrac{\sqrt{3}}{2})\big)$ in $\mathbb{R}^2$;

2. $d = 8$: the lattice $E_8$;

3. $d = 24$: the Leech lattice (described in [18, Sec. 5.11]).

Viazovska [71] proved the second of these conjectures by finding an explicit construction
of a magic function for the lattice $E_8$; her proof, using complex analysis and
modular forms, is the subject of Chapter 6. A subsequent construction using similar
methods of a magic function for the Leech lattice in the case of 24 dimensions, proving
the third conjecture, was given by Cohn et al. [16]. The first conjecture regarding the
existence of a magic function for the hexagonal lattice in $\mathbb{R}^2$ remains open (as of 2023).


A.10 Radial functions and their Fourier transforms 


A function : RÎ > Ris called radially symmetric, or a radial function, iff (x) depends 
only on the radial coordinate of x, that is, if f(x) = f(y) whenever ||x|| = |lyll. Clearly, f(x) 
is radial if and only if it can be represented as 


fœ =F (IIxll) 


for some function f : [0, co) > R. The function f(r) is determined uniquely, as f (r) is the 
unique value that f(x) takes on the sphere {x : ||x|| = r}. We refer to f(r) as the radial 
profile of f(x).! 

Iff : RÊ > R is a general—not necessarily radially symmetric—function, then we 
can apply a standard analytic trick to f(x) to obtain a radial function to perform radial 
symmetrization, that is, to average out the function over concentric spherical shells of 
equal radius around 0. More precisely, we define 


faa) = > | F(lety)d04.40). 
gti 

the integral over the unit sphere S*? = {y € RÊ : |lyl| = 1} with respect to its surface 

area measure og, normalized to be a weighted average by dividing by the total sphere 

surface area Sq-1 = Gas? . We call faa (X) the radially symmetrized version of f(x). 

Note that f is radial if and only if it coincides with its radially symmetrized version. 


Lemma A.23. Let f : RÊ > R be a radial function. Then f(x) is a Schwartz function if 
and only if the radial profile f (r) satisfies the following properties: 

1. f(r) is the restriction to [0, co) of an infinitely differentiable even function on R. 

2. r"f™(r) ~z 0 for alln,m > 0. 


Proof. Exercise A.9. 


1 Some authors commit the mild abuse of not making a clear distinction between a function and its
radial profile, for example, by referring to them interchangeably and denoting both of them with the
same symbol.

Lemma A.24. The radially symmetrized function $f_{\mathrm{rad}}(x)$ has the following alternative
expression:

\[ f_{\mathrm{rad}}(x) = \frac{1}{\nu_d(SO(d))} \int_{SO(d)} f(Ax)\,d\nu_d(A). \tag{A.19} \]

The meanings of the symbols in this formula are as follows: $SO(d)$ is the special orthogonal
group of order $d$, that is, the group of $d \times d$ orthogonal matrices with determinant 1; and
$\nu_d$ is the Haar measure on $SO(d)$, that is, the unique (up to scalar multiplication) Borel
measure on $SO(d)$ that is invariant under the group action, i.e., satisfies $\nu_d(A \cdot E) = \nu_d(E)$
for all $A \in SO(d)$ and all Borel sets $E \subset SO(d)$ (with $A \cdot E$ denoting the set of matrices
$\{AB : B \in E\}$).


Proof. Exercise A.11. 


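Lemma A.25. If $f : \mathbb{R}^d \to \mathbb{R}$ is a Schwartz function, then $\widehat{f_{\mathrm{rad}}} = (\hat{f})_{\mathrm{rad}}$; that is, the Fourier
transform of the radial symmetrization of $f$ is equal to the radial symmetrization of the
Fourier transform of $f$.


Proof. If $A$ is a $d \times d$ orthogonal matrix and $g : \mathbb{R}^d \to \mathbb{R}$, then denote by $g_A$ the function
$g$ "rotated by the transformation $A$", that is,

\[ g_A(x) = g(Ax) \qquad (x \in \mathbb{R}^d). \]

It is trivial to check that $\widehat{(g_A)} = (\hat{g})_A$ (the Fourier transform commutes with orthogonal
transformations). Now using (A.19) (applied to both $f$ and $\hat{f}$), it follows that

\[
\begin{aligned}
\widehat{f_{\mathrm{rad}}}(y)
&= \mathcal{F}_d\Big[x \mapsto \frac{1}{\nu_d(SO(d))} \int_{SO(d)} f(Ax)\,d\nu_d(A)\Big](y) \\
&= \frac{1}{\nu_d(SO(d))} \int_{SO(d)} \mathcal{F}_d\big[x \mapsto f(Ax)\big](y)\,d\nu_d(A) \\
&= \frac{1}{\nu_d(SO(d))} \int_{SO(d)} \widehat{(f_A)}(y)\,d\nu_d(A)
 = \frac{1}{\nu_d(SO(d))} \int_{SO(d)} (\hat{f})_A(y)\,d\nu_d(A) \\
&= \frac{1}{\nu_d(SO(d))} \int_{SO(d)} \hat{f}(Ay)\,d\nu_d(A) = (\hat{f})_{\mathrm{rad}}(y).
\end{aligned}
\]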


From Lemma A.25 it follows in particular that the Fourier transform of a radial
Schwartz function $f : \mathbb{R}^d \to \mathbb{R}$ is also a radial function. Because of this, when discussing
radial functions, it is helpful to think of the Fourier transform in $d$ dimensions as an
operator acting directly on the associated radial profile. That is, if $f : \mathbb{R}^d \to \mathbb{R}$ has an
associated radial profile $f(r)$, and $g(y) = \mathcal{F}_d[f](y)$ denotes the Fourier transform of $f(x)$
with associated radial profile $g(\rho)$, then we refer to $g(\rho)$ as the radial Fourier transform
of $f(r)$ and denote this as

\[
g(\rho) = \mathcal{F}_d^{\mathrm{rad}}[f](\rho).
\]


See Exercise A.13 at the end of this Appendix for more details (which are interesting but 
not needed for our purposes) on how this transform can be described more explicitly 
and some of its properties. 

Radial Schwartz functions have a decomposition into “even” and “odd” parts with 
respect to the taking of radial Fourier transforms. This is explained in the following 
lemma. 


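Lemma A.26. Let $f : \mathbb{R}^d \to \mathbb{R}$ be a radial Schwartz function. Then $f$ has a unique
representation of the form

\[
f = f_+ + f_-, \tag{A.20}
\]

where $f_+, f_- : \mathbb{R}^d \to \mathbb{R}$ are radial Schwartz functions with

\[
\mathcal{F}_d[f_+] = f_+, \qquad \mathcal{F}_d[f_-] = -f_-,
\]

that is, $f_\pm$ are eigenfunctions of the Fourier transform with associated eigenvalues $+1$ and
$-1$, respectively. The Fourier transform of $f$ is then given by

\[
\mathcal{F}_d[f] = f_+ - f_-, \tag{A.21}
\]

and $f_+$, $f_-$ are given by

\[
f_+ = \frac{f + \mathcal{F}_d[f]}{2}, \qquad f_- = \frac{f - \mathcal{F}_d[f]}{2}.
\]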


Proof. Exercise A.12. 


We call (A.20) the Fourier parity decomposition for radial Schwartz functions. We
call $f_+$ the Fourier-even part of $f$ and call $f_-$ the Fourier-odd part of $f$.
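
The reason the decomposition works can be sketched in a few lines. The sketch below assumes the Fourier inversion formula in the form $\mathcal{F}_d[\mathcal{F}_d[f]](x) = f(-x)$ (which holds for the usual normalization with kernel $e^{-2\pi i \langle x, y\rangle}$; the normalization is an assumption here), together with the fact that a radial function satisfies $f(-x) = f(x)$. It is an outline of one possible argument, not necessarily the one intended in Exercise A.12.

```latex
\begin{align*}
\mathcal{F}_d[\mathcal{F}_d[f]](x) &= f(-x) = f(x)
  && \text{(inversion, $f$ radial)}, \\
\mathcal{F}_d\Bigl[\tfrac{1}{2}\bigl(f \pm \mathcal{F}_d[f]\bigr)\Bigr]
  &= \tfrac{1}{2}\bigl(\mathcal{F}_d[f] \pm f\bigr)
   = \pm \tfrac{1}{2}\bigl(f \pm \mathcal{F}_d[f]\bigr)
  && \text{(eigenvalues $\pm 1$)}, \\
\tfrac{1}{2}\bigl(f + \mathcal{F}_d[f]\bigr) + \tfrac{1}{2}\bigl(f - \mathcal{F}_d[f]\bigr) &= f
  && \text{(the two parts sum to $f$)}.
\end{align*}
% Uniqueness: if f = g_+ + g_- with F_d[g_\pm] = \pm g_\pm, then applying F_d
% gives F_d[f] = g_+ - g_-, and solving the two linear equations yields
% g_\pm = (f \pm F_d[f]) / 2.
```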


Suggested exercises for Section A.10. A.9, A.10, A.11, A.12, A.13. 


A.11 Structural properties of $E_8$ magic functions


Theorem A.21 provides a powerful technique for proving upper bounds on the optimal
packing density of $\mathbb{R}^d$. This was used in [14] to prove improved numerical upper bounds
for $\Delta_{\mathrm{optimal}}(d)$. Even more intriguingly, it raises the natural question of how we can go
about using the theorem to try to derive a sharp upper bound in any given dimension,
or at least one that is best possible using the method. Needless to say, this is a highly
nontrivial question. The difficulty lies in the fact that we are trying to optimize the bound
(the quantity on the right-hand side of (A.16)) over a rather peculiar-looking space of
functions. Without any further clues as to what sort of properties an optimizing function
$f$ might have, this is tantamount to groping in the dark.

Fortunately, in the case of 8 and 24 dimensions, Cohn and Elkies pointed out that we
can infer some interesting structural properties of a hypothetical optimizer by using the
additional (conjectured, at that point) knowledge that in those dimensions, the optimizers
are magic functions for the $E_8$ and Leech lattices, respectively. Let us see what those
structural properties are. We focus here on the case of 8 dimensions, where these properties
turned out to be the crucial clues that ultimately led Viazovska to her construction
of an $E_8$ magic function.

First, we can strip away one apparent layer of complexity from the optimization 
problem by noting that although the class of functions f we are optimizing over consists 
of functions on $\mathbb{R}^d$ (that is, functions of $d$ real variables), there is no real loss of generality
in assuming that the function in question is a radial function—a huge simplification, 
since radial functions are described in terms of their radial profile, which is a function 
of a single real variable. The idea is made precise in the following lemma. 


Lemma A.27. If $f : \mathbb{R}^d \to \mathbb{R}$ is a function satisfying the conditions of Theorem A.21 with
parameter $\rho$, then there exists a radial Schwartz function $g : \mathbb{R}^d \to \mathbb{R}$ that satisfies the
same conditions with the same value of $\rho$.


Proof. Take $g = f_{\mathrm{rad}}$, the radially symmetrized version of $f$, which is also a Schwartz
function (Exercise A.10). Using Lemma A.25, it is easy to check that $g$ satisfies the same
conditions that $f$ satisfied, with the same value of $\rho$.


A second important observation concerns a necessary condition a function must 
satisfy to be a magic function. 


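Lemma A.28. If $f : \mathbb{R}^d \to \mathbb{R}$ is a Schwartz function that is a magic function for a lattice
$\Lambda \subset \mathbb{R}^d$, then it must satisfy

$f(x) = 0$ for all $x \in \Lambda \setminus \{0\}$ and

$\widehat{f}(x) = 0$ for all $x \in \Lambda^* \setminus \{0\}$.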


Proof. First, note that we can assume without loss of generality that $\Lambda$ has the property
that

\[
r(\Lambda) = \rho/2.
\]

Indeed, if $\Lambda$ does not satisfy this, then we can replace it by a scaled version $a\Lambda$ of itself
with $a > 0$ chosen so as to cause this equation to be satisfied; the scaling does not change
the value of $\delta_\Lambda$, so $f$ would still be a magic function for the rescaled lattice.
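Now combining the Poisson summation formula (A.6) with the assumptions on $f$,
we have that

\[
\begin{aligned}
f(0) &\ge f(0) + \sum_{x \in \Lambda \setminus \{0\}} f(x) = \sum_{x \in \Lambda} f(x) \\
&= \frac{1}{\mathrm{covol}(\Lambda)} \sum_{y \in \Lambda^*} \widehat{f}(y)
 = \frac{1}{\mathrm{covol}(\Lambda)} \Bigl( \widehat{f}(0) + \sum_{y \in \Lambda^* \setminus \{0\}} \widehat{f}(y) \Bigr) \\
&\ge \frac{\widehat{f}(0)}{\mathrm{covol}(\Lambda)} = \frac{f(0)}{\mathrm{covol}(\Lambda)}.
\end{aligned}
\tag{A.22}
\]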


This is equivalent to saying that $\mathrm{covol}(\Lambda) \ge 1$, which in turn is equivalent (refer to (A.3))
to the relation

\[
\delta_\Lambda \le \mathrm{vol}(B_{r(\Lambda)}(0)).
\]

Since we assumed that $f$ was a magic function, $\delta_\Lambda$ is also equal to $\mathrm{vol}(B_{\rho/2}(0))$, so a final
equivalent reformulation of the inequality between the leftmost and rightmost terms
in (A.22) is the statement that $\rho/2 \le r(\Lambda)$. However, we started the proof by assuming
that $\rho/2$ is equal to $r(\Lambda)$. This means that both (weak) inequalities in (A.22) must actually
hold as equalities. The only way in which this can be true is if all the summation terms
that were discarded to obtain those inequalities (the terms $f(x)$ for $x \in \Lambda \setminus \{0\}$ in the
first inequality, which were known to be nonpositive, and the terms $\widehat{f}(y)$ for $y \in \Lambda^* \setminus \{0\}$
in the second inequality, which were known to be nonnegative) are necessarily 0; this was
exactly the claim to prove.


Combining the above results and specializing to the case of $E_8$, we easily obtain the
following result.


Theorem A.29 (Necessary condition for $E_8$ magic). Let $f : \mathbb{R}^8 \to \mathbb{R}$ be a radial Schwartz
function, and let $f_+$ and $f_-$ denote the Fourier-even and Fourier-odd parts of $f$ as in (A.20).
Define the functions $\Phi, \widehat{\Phi}, \Phi_+, \Phi_- : [0, \infty) \to \mathbb{R}$ by

\[
\begin{aligned}
\Phi(r) &= f(r) && \text{(the radial profile of $f$)}, \\
\widehat{\Phi}(r) &= \mathcal{F}_8^{\mathrm{rad}}[\Phi](r) && \text{(the radial profile of $\widehat{f}$)}, \\
\Phi_+(r) &= f_+(r) = \frac{\Phi(r) + \widehat{\Phi}(r)}{2} && \text{(the radial profile of $f_+$)}, \\
\Phi_-(r) &= f_-(r) = \frac{\Phi(r) - \widehat{\Phi}(r)}{2} && \text{(the radial profile of $f_-$)}.
\end{aligned}
\]

If $f$ is a magic function for $E_8$, then the following conditions hold:

1. $\Phi(0) = \widehat{\Phi}(0) > 0$;

2. $\Phi(r)$, $\widehat{\Phi}(r)$, $\Phi_+(r)$, and $\Phi_-(r)$ have zeros at the points $r = \sqrt{2n}$ for $n = 1, 2, 3, \ldots$.

3. $\Phi(r)$ does not change signs at $r = \sqrt{2n}$ for $n = 2, 3, \ldots$ (so its zeros there are of even
order, assuming that it is real-analytic so that the order of the zeros is well-defined).

4. $\widehat{\Phi}(r)$ does not change signs at $r = \sqrt{2n}$ for $n = 1, 2, 3, \ldots$ (so its zeros there are of even
order, assuming that it is real-analytic).

5. If $\Phi(r)$ does not have zeros in $(0, \sqrt{2})$, then it changes signs at $r = \sqrt{2}$, so its zero there
is of odd order, assuming that it is real-analytic.


Proof. Exercise A.14. 
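
The location of the zeros in condition 2 reflects the set of vector lengths of $E_8$: by Lemma A.28, a magic function and its Fourier transform must vanish at the nonzero lattice points, and the norms of those points are of the form $\sqrt{2n}$. The following minimal Python sketch checks this for the first two shells by brute force. It assumes the standard coordinate model of $E_8$ (all coordinates integers or all half-integers, with an even coordinate sum); this model is an assumption here and may differ superficially from the construction used in Section A.7. The function name `e8_shell_counts` is illustrative.

```python
from collections import Counter
from itertools import product
from math import isqrt

def e8_shell_counts(max_norm_sq=4):
    # Work with doubled coordinates y = 2x so that everything is an integer:
    # x lies in E_8 iff the entries of y are all even or all odd and sum(y)
    # is divisible by 4 (that is, sum(x) is an even integer).
    # The squared norm is |x|^2 = |y|^2 / 4.
    counts = Counter()
    bound = 4 * max_norm_sq          # bound on |y|^2
    m = isqrt(bound)                 # largest possible |y_i|
    evens = [v for v in range(-m, m + 1) if v % 2 == 0]
    odds = [v for v in range(-m, m + 1) if v % 2 != 0]
    for values in (evens, odds):
        for y in product(values, repeat=8):
            n2 = sum(v * v for v in y)
            if n2 <= bound and sum(y) % 4 == 0:
                counts[n2 // 4] += 1  # n2 is always divisible by 4 here
    return dict(sorted(counts.items()))

counts = e8_shell_counts(4)
print(counts)  # {0: 1, 2: 240, 4: 2160}
```

Only the squared norms 0, 2, and 4 occur up to this bound, with 240 vectors of length $\sqrt{2}$ and 2160 vectors of length $\sqrt{4}$; no lattice vector has squared norm 1 or 3.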


Suggested exercises for Section A.11. A.14, A.15. 


A.12 Summary 


In this appendix, we have developed a solid framework for the study of the sphere pack- 
ing problem in d dimensions, with a focus on the case of d = 8, from the point of view 
of the connections of the problem to harmonic analysis. The main tool is an analytic re- 
sult, Theorem A.21, which, along with related observations such as Lemma A.27, reduces 
the problem to a purely analytic question: namely, can a radial function be constructed 
with certain special properties involving simultaneous conditions on the function and 
its Fourier transform? 

An additional tool of importance is Theorem A.29. This result plays a motivational 
role in helping us think about the sphere packing problem in 8 dimensions, as it nar- 
rows down considerably the class of functions that we need to consider as hypothetical 
magic function candidates. Specifically, the theorem suggests that to find an $E_8$ magic
function, we should look for a function $\Phi(r)$ of a single (radial) real variable that has
the property that both $\Phi(r)$ and its radial Fourier transform $\mathcal{F}_8^{\mathrm{rad}}[\Phi]$ have zeros at the
points $r = \sqrt{2}, \sqrt{4}, \sqrt{6}, \ldots$. This is a rather idiosyncratic problem quite unlike anything
else mathematicians had ever seen before, and its solution eluded the researchers think- 
ing about the problem until Viazovska came up with her breakthrough solution in 2016. 
Conceptually, what makes the problem hard is that it is difficult to control the zeros of 
a function and its Fourier transform simultaneously: it is straightforward to construct 
functions with a given set of zeros and functions whose Fourier transform has a given 
set of zeros, but no standard tools or ideas in (pre-2016) harmonic analysis offer much of 
a clue for how to do both of those things at the same time, or indeed give much insight 
into whether it can be done at all. 

One of the conditions in Theorem A.29 offers a possible way out of this conundrum:
specifically, the point of considering separately the components $\Phi_+$ and $\Phi_-$ in the Fourier
parity decomposition of $\Phi$ is that each of those components is an eigenfunction of the
radial Fourier transform, and thus, if we can force it to have the required set of zeros,
then its Fourier transform will automatically have those zeros as well. So the problem
is reduced to constructing radial Fourier eigenfunctions that have zeros (with certain
constraints on their orders) at $\sqrt{2n}$, $n = 1, 2, \ldots$. Of course, the condition of being a
Fourier eigenfunction is not a trivial one to satisfy either, especially when combined
with the constraints on the zeros, so it is not a priori clear that this observation makes
the problem any more tractable; it seems conceivable that we have merely traded one
difficult-to-satisfy condition for another.
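
To make the notion of a radial Fourier eigenfunction in 8 dimensions concrete, the following minimal Python sketch evaluates the radial Fourier transform numerically through the Bessel-integral formula (A.23) of Exercise A.13 with $d = 8$, and checks two simple eigenfunctions: the Gaussian $e^{-\pi r^2}$ (eigenvalue $+1$) and $(4 - 2\pi r^2) e^{-\pi r^2}$ (eigenvalue $-1$). Neither of these has zeros at the points $\sqrt{2n}$, so they only illustrate the eigenfunction property, not a solution of the problem. The helper names, the truncation radius, and the quadrature parameters are illustrative assumptions.

```python
import math

def bessel_j(n, x, terms=64):
    # Integer-order Bessel function from the integral representation
    # J_n(x) = (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt, midpoint rule.
    total = 0.0
    for k in range(terms):
        t = math.pi * (k + 0.5) / terms
        total += math.cos(n * t - x * math.sin(t))
    return total / terms

def radial_fourier_8(profile, rho, r_max=5.0, intervals=1200):
    # Formula (A.23) with d = 8:
    #   G(rho) = 2*pi * rho^(-3) * int_0^inf F(r) * r^4 * J_3(2*pi*rho*r) dr,
    # truncated to [0, r_max] and evaluated with Simpson's rule (rho > 0).
    h = r_max / intervals
    total = 0.0
    for i in range(intervals + 1):
        r = i * h
        weight = 1 if i in (0, intervals) else (4 if i % 2 else 2)
        total += weight * profile(r) * r**4 * bessel_j(3, 2 * math.pi * rho * r)
    return 2 * math.pi * rho**-3 * total * h / 3

def gauss(r):
    return math.exp(-math.pi * r * r)

def odd_example(r):
    return (4 - 2 * math.pi * r * r) * math.exp(-math.pi * r * r)

for rho in (0.5, 1.0, 1.5):
    print(rho,
          radial_fourier_8(gauss, rho) - gauss(rho),              # close to 0
          radial_fourier_8(odd_example, rho) + odd_example(rho))  # close to 0
```

Both printed differences are zero up to quadrature error, consistent with the two profiles being Fourier-even and Fourier-odd, respectively.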

Nonetheless, constructing Fourier eigenfunctions with the correct set of zeros turns 
out to be precisely the right approach. This was the path taken successfully in Via- 
zovska’s solution of the sphere packing problem in 8 dimensions; for the details, read 
Chapter 6, which you now have the necessary background to tackle. 


Suggested exercises for Section A.12. A.16, A.17. 
Exercises for Appendix A


A.1 Is an optimal sphere packing in $\mathbb{R}^d$ unique? Why, or why not? If not, then what
can be said about the extent of the nonuniqueness?

A.2 Prove Lemma A.3.

A.3 Prove Lemma A.4.

A.4 Prove Lemma A.5.

A.5 Prove that for any lattice $\Lambda$ in $\mathbb{R}^d$, $\mathrm{covol}(\Lambda^*) = \mathrm{covol}(\Lambda)^{-1}$.

A.6 Prove Theorem A.7. One possible proof proceeds in two steps: first, prove the result
for the specific lattice $\Lambda = \mathbb{Z}^d$ by deducing it from the original Poisson summation
formula for functions on $\mathbb{R}$. Second, derive the result in full generality by
starting with the formula for $\mathbb{Z}^d$ and applying a linear coordinate change.

For a more direct approach, see [15, Appendix A].

A.7 Prove Lemma A.16.

A.8 Another construction of the lattice $E_8$ (discussed, for example, in [12]) starts by
postulating the existence of a basis $x_1, \ldots, x_8 \in \mathbb{R}^8$ whose Gram matrix (the matrix
of inner products $\langle x_j, x_k \rangle$) takes the form

\[
\bigl(\langle x_j, x_k \rangle\bigr)_{j,k=1}^{8} =
\begin{pmatrix}
 2 & -1 &  0 &  0 &  0 &  0 &  0 &  0 \\
-1 &  2 & -1 &  0 &  0 &  0 &  0 &  0 \\
 0 & -1 &  2 & -1 & -1 &  0 &  0 &  0 \\
 0 &  0 & -1 &  2 &  0 &  0 &  0 &  0 \\
 0 &  0 & -1 &  0 &  2 & -1 &  0 &  0 \\
 0 &  0 &  0 &  0 & -1 &  2 & -1 &  0 \\
 0 &  0 &  0 &  0 &  0 & -1 &  2 & -1 \\
 0 &  0 &  0 &  0 &  0 &  0 & -1 &  2
\end{pmatrix}.
\]

Prove that such a basis exists and try to redevelop the results of Section A.7 based
on this construction.

A.9 Prove Lemma A.23. (See also [32, Sec. 3], [17, Subsec. 2.3].)

A.10 Prove that the radially symmetrized version of a Schwartz function is a Schwartz
function.

A.11 Prove Lemma A.24.

A.12 Prove Lemma A.26.

A.13 Radial Fourier transforms in $\mathbb{R}^d$. ([31, Sec. B.5], [45, Secs. 4.20, 4.23])

(a) Let $f : \mathbb{R}^d \to \mathbb{R}$ be a radial function with a well-defined Fourier transform.
Denote $F(r) = f(r)$ and $G(\rho) = \widehat{f}(\rho)$ (the radial profiles of $f$ and $\widehat{f}$, respectively).
Prove that $F$ and $G$ are related to each other by

\[
G(\rho) = \frac{2\pi}{\rho^{d/2-1}} \int_0^\infty F(r)\, r^{d/2} J_{d/2-1}(2\pi \rho r)\, dr, \tag{A.23}
\]

\[
F(r) = \frac{2\pi}{r^{d/2-1}} \int_0^\infty G(\rho)\, \rho^{d/2} J_{d/2-1}(2\pi \rho r)\, d\rho. \tag{A.24}
\]

Here we use the notation $J_\alpha(z)$ for the Bessel function of the first kind of
index $\alpha$, an entire function defined by

\[
J_\alpha(z) = \sum_{n=0}^{\infty} \frac{(-1)^n}{n!\, \Gamma(n + \alpha + 1)} \Bigl(\frac{z}{2}\Bigr)^{2n+\alpha}
\]

(see also Exercise 1.16 on p. 74). The integral transform that associates a function
$G$ on $[0, \infty)$ with another function $F$ on $[0, \infty)$ according to (A.23) is
known as the Hankel transform.

(b) Prove that if $f : \mathbb{R}^d \to \mathbb{R}$ is a radial square-integrable function that is an
eigenfunction of the Fourier transform, that is, $\mathcal{F}_d(f) = \lambda f$, then $\lambda = 1$ or
$\lambda = -1$.

(c) Let $\alpha > 0$. Define the sequence of polynomials $(L_n^\alpha(x))_{n=0}^\infty$ by the formula

\[
L_n^\alpha(x) = \sum_{k=0}^{n} \frac{(-1)^k}{k!} \binom{n+\alpha}{n-k} x^k.
\]

The polynomials $L_n^\alpha(x)$ are called the Laguerre polynomials with parameter $\alpha$.
Prove that the polynomials $L_n^\alpha(x)$ satisfy the orthogonality relation

\[
\int_0^\infty L_n^\alpha(x) L_m^\alpha(x)\, e^{-x} x^\alpha\, dx = \frac{\Gamma(n + \alpha + 1)}{n!}\, \delta_{mn} \qquad (m, n \ge 0).
\]

Here $\delta_{mn}$ denotes the Kronecker delta.

(d) Let $d \ge 1$. Define the radial functions $\psi_n^{(d)}(x) = G_n^{(d)}(|x|)$ on $\mathbb{R}^d$, $n \ge 0$, by

\[
G_n^{(d)}(r) = L_n^{d/2-1}(2\pi r^2)\, e^{-\pi r^2}.
\]

Prove that $\psi_n^{(d)}$ is an eigenfunction of the Fourier transform with eigenvalue $(-1)^n$.

(e) Prove that the sequence $(\psi_n^{(d)})_{n=0}^\infty$ forms an orthogonal basis of the subspace
$L^2_{\mathrm{rad}}(\mathbb{R}^d)$ of $L^2(\mathbb{R}^d)$ consisting of radial functions. (In other words, together
with the previous claim, this shows that the sequence $(\psi_n^{(d)})_{n=0}^\infty$ diagonalizes the
restriction of the $d$-dimensional Fourier transform to the radial functions.)

A.14 Prove Theorem A.29.
A.15 How close are the conditions listed in Theorem A.29 to being sufficient for the
function $f$ to be a magic function for $E_8$? That is, what additional mild assumptions
on $\Phi(r)$ and $\widehat{\Phi}(r)$ would guarantee that $f$ is a magic function?

A.16 The Leech lattice. Prove the following analogue of Theorem A.8:

Theorem A.30 ([18, pp. 131-134]). There exists a lattice in $\mathbb{R}^{24}$, denoted $L_{24}$, and
known as the Leech lattice, with the following properties:

(a) The packing density of the sphere packing associated with $L_{24}$ is $\frac{\pi^{12}}{12!}$.

(b) The set of Euclidean norms of points of $L_{24}$ is

\[
\{\sqrt{2k} : k = 0, 2, 3, 4, \ldots\}.
\]

(c) The numbers $(b_n)_{n \ge 0}$ defined by

\[
b_n = \#\{x \in L_{24} : |x|^2 = 2n\}
\]

are given explicitly by

\[
b_n = \frac{65520}{691} \bigl(\sigma_{11}(n) - \tau(n)\bigr) \qquad (n \ge 1).
\]

For the definitions of $\sigma_{11}(n)$ and $\tau(n)$, see (5.2) and (5.28).


A.17 Formulate an analogue of Theorem A.29 for the case of a magic function for the 
Leech lattice in 24 dimensions. 
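
The formula in part (c) of Theorem A.30 can be sanity-checked against part (b) with a short computation. The following minimal Python sketch assumes the standard definitions, namely that $\sigma_{11}(n)$ is the sum of the 11th powers of the divisors of $n$ and that $\tau(n)$ is the coefficient of $q^n$ in $q \prod_{m \ge 1} (1 - q^m)^{24}$ (presumed to agree with (5.2) and (5.28)); the function names are illustrative.

```python
from fractions import Fraction

def sigma(k, n):
    # Divisor power sum sigma_k(n).
    return sum(d**k for d in range(1, n + 1) if n % d == 0)

def tau_values(limit):
    # tau(1), ..., tau(limit): coefficients of q * prod_{m>=1} (1 - q^m)^24,
    # obtained by multiplying truncated power series in place.
    coeffs = [0] * limit   # coeffs[j] = coefficient of q^j in the product
    coeffs[0] = 1
    for m in range(1, limit):
        for _ in range(24):
            # Multiply by (1 - q^m); go downwards so old values are used.
            for j in range(limit - 1, m - 1, -1):
                coeffs[j] -= coeffs[j - m]
    return {n: coeffs[n - 1] for n in range(1, limit + 1)}

tau = tau_values(4)
b = {n: Fraction(65520, 691) * (sigma(11, n) - tau[n]) for n in range(1, 5)}
print({n: int(v) for n, v in b.items()})
# {1: 0, 2: 196560, 3: 16773120, 4: 398034000}
```

The value $b_1 = 0$ matches the absence of $k = 1$ in part (b), and all four values are integers even though the prefactor has denominator 691.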